Good idea. In the meantime, however, I've dealt with it in another way. Whenever I detect that an entry I'm about to store might already be in the index, I search for it first. If the hit count is not 0, I skip it; otherwise I index it. If that makes sense... :)
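
A minimal sketch of the search-before-index check described above. It uses a toy in-memory index rather than the Lucene.Net API; `TinyIndex`, `index_unless_present` and the `id` key field are hypothetical names chosen for illustration, and the sketch assumes each document carries a field that identifies it uniquely.

```python
from collections import defaultdict


class TinyIndex:
    """Toy in-memory stand-in for a search index (NOT the Lucene.Net API)."""

    def __init__(self):
        # Maps (field, value) -> ids of the documents holding that exact value.
        self._postings = defaultdict(list)
        self._docs = []

    def search(self, field, value):
        """Return the ids of documents whose `field` equals `value`."""
        return list(self._postings.get((field, value), []))

    def add(self, doc):
        """Index a document (a dict of field -> value) unconditionally."""
        doc_id = len(self._docs)
        self._docs.append(doc)
        for field, value in doc.items():
            self._postings[(field, value)].append(doc_id)
        return doc_id

    def __len__(self):
        return len(self._docs)


def index_unless_present(index, doc, key_field):
    """Search for the document's key first; index only if the hit count is 0."""
    hits = index.search(key_field, doc[key_field])
    if hits:
        return False  # already in the index, so skip it
    index.add(doc)
    return True


index = TinyIndex()
index_unless_present(index, {"id": "a1", "body": "first version"}, "id")
index_unless_present(index, {"id": "a1", "body": "duplicate"}, "id")  # skipped
```

With a real index the same check only works if the lookup can see documents added earlier in the same run, so the searcher may need to be reopened between writes.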
Igor

-----Original message-----
From: Jokin Cuadrado [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 4 March 2008 13:01
To: lucene-net-user@incubator.apache.org
Subject: Re: Find unstored content

You could use a term enumerator to get the term list of a field and take the last of them. If I remember correctly, they are sorted alphabetically.

--
jokin

Original message: Tue, 2008-03-04 12:40 +0100, author: Igor Kalders
> Does anyone have an idea if I would be able to tackle this _without_
> indexing all data and _without_ implementing a custom checkpoint file?
> If I can get this done, it would save me more than half of the index
> disk space.