Daniel,

You may want to look at SOLR-1375, which enables ID checking
using a BloomFilter (with a specified false-positive error
rate). Otherwise, for what you're trying to do, you'd need
to maintain a hash map of the unique values yourself?
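
A rough sketch of the hash map idea, in case it helps. The class
name UniqueFieldTracker and its methods are made up for
illustration and are not part of Lucene or Solr. It only mirrors the
values of one unique field in memory, so you would still have to
seed it from the existing index at startup and call it alongside
every add and delete:

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper (not a Lucene class): keeps an in-memory copy of
 * the values of one unique field so a conflict can be reported at the
 * moment a document is added, before any reader sees the change.
 */
class UniqueFieldTracker {
    private final Set<String> values = new HashSet<String>();

    /** Returns true if the value was unseen and is now reserved, false on a conflict. */
    synchronized boolean tryAdd(String value) {
        return values.add(value);
    }

    /** Call when a document holding this value is deleted; returns true if it was tracked. */
    synchronized boolean remove(String value) {
        return values.remove(value);
    }

    /** Number of distinct values currently tracked. */
    synchronized int size() {
        return values.size();
    }

    public static void main(String[] args) {
        UniqueFieldTracker ids = new UniqueFieldTracker();
        System.out.println(ids.tryAdd("doc-1")); // first add succeeds
        System.out.println(ids.tryAdd("doc-1")); // duplicate is rejected
        ids.remove("doc-1");                     // simulate deleting the document
        System.out.println(ids.tryAdd("doc-1")); // value can be reused after the delete
    }
}
```

Memory grows with the number of distinct values, which is the
trade-off against the BloomFilter approach: a plain Bloom filter is
much smaller but can report false positives and cannot handle
deletes.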

-J

On Thu, Aug 13, 2009 at 7:33 AM, Daniel Shane<sha...@lexum.umontreal.ca> wrote:
> Hi all!
>
> I'm currently running a big lucene index and one of my main concerns is the
> integrity of the data entered. A few things come to mind, like enforcing
> that certain fields be non-blank, forcing certain formats etc...
>
> All these validations are easy to do with lucene, since I can validate the
> document before it is indexed or when it is retrieved.
>
> The thing I have a hard time with, however, is field uniqueness.
>
> Let's say I have a field and I really want it to be unique. I can't seem to
> find out how to do it during the indexing phase, since nothing that is
> added to the index is readable by an index reader until the index is
> closed.
>
> Add to that the fact that items can be deleted from the index during
> indexing, and the only way I have to check uniqueness is to walk every
> value of the unique field using termEnums and check its docFreq.
>
> This has the major disadvantage that I cannot inform people who are using the
> library of a uniqueness conflict when it happens, only when the index is
> closed.
>
> Does anyone have an idea on how I could check an index that is in the
> process of being indexed (things added, things deleted) for the uniqueness of
> a given field *at the time I index a document*?
>
> Daniel Shane
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

