Daniel,

You may want to look at SOLR-1375, which enables ID checking using a BloomFilter (with a specified error rate for false positives). Otherwise, for what you're trying to do, you'd probably need to maintain a hash map of the unique values yourself?
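To make the hash map idea concrete, here is a minimal sketch in plain Java. It assumes the set of unique values fits in memory and that all adds and deletes go through a single writer. The class and method names (UniqueFieldTracker, seed, tryReserve, release) are hypothetical and not part of Lucene or Solr.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: tracks the values of one unique field during an indexing session. */
class UniqueFieldTracker {
    private final Set<String> seen = new HashSet<>();

    /** Seed with values already in the committed index, e.g. collected in one pass when the index is opened. */
    void seed(Iterable<String> existingValues) {
        for (String value : existingValues) {
            seen.add(value);
        }
    }

    /** Returns true if the value was free and is now reserved; false means a uniqueness conflict. */
    boolean tryReserve(String value) {
        return seen.add(value);
    }

    /** Call when a document is deleted so its value can be used again. */
    void release(String value) {
        seen.remove(value);
    }
}
```

You would call tryReserve() just before adding a document and reject the document if it returns false, and call release() whenever you delete one. That way the conflict is reported at the time the document is indexed rather than when the index is closed. A Bloom filter would use less memory, but it can report false positives, and a standard one does not support removals, which matters if you delete during indexing.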
-J

On Thu, Aug 13, 2009 at 7:33 AM, Daniel Shane<sha...@lexum.umontreal.ca> wrote:
> Hi all!
>
> I'm currently running a big Lucene index, and one of my main concerns is the
> integrity of the data entered. A few things come to mind, like enforcing
> that certain fields be non-blank, forcing certain formats, etc.
>
> All these validations are easy to do with Lucene, since I can validate the
> document before it is indexed or when it is retrieved.
>
> The thing I have a hard time with, however, is field uniqueness.
>
> Let's say I have a field and I really want it to be unique. I can't seem to
> find out how to do it during the indexing phase, since nothing that is
> added to the index is readable by an index reader until the index is
> closed.
>
> Add to that the fact that items can be deleted from the index during
> indexing, and the only way I have to check uniqueness is to go through every
> value of the unique field using TermEnums and check its docFreq.
>
> This has the major disadvantage that I cannot inform people who are using the
> library of a uniqueness conflict when it happens, only when the index is
> closed.
>
> Does anyone have an idea how I could check an index that is in the
> process of being indexed (things added, things deleted) for the uniqueness of
> a given field *at the time I index a document*?
>
> Daniel Shane
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org