On Thu, 05 May 2011, Johnny Mariéthoz wrote:
> Error when putting the term ''non-meat'' into db
> (hitlist=intbitset([22464])): (1062, "Duplicate entry '16777215' for
> key 1")

The duplicate entry problem is related to incremental indexing: index
terms are badly washed or truncated before they are pushed to the index.
It can happen due to bad UTF-8 characters, changes in word-breaking
procedures, etc.  We have seen it on our servers too, mostly for
full-text indexing.
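
To illustrate the kind of mismatch I mean, here is a toy sketch in
Python.  It is not Invenio code: ToyIndex, store_key and MAX_TERM_LEN
are made-up names, and the column width of 50 is an assumption for
illustration only.  It shows how looking up an unwashed term can miss a
row that was stored truncated, so that an incremental update tries to
insert the same key again.

```python
MAX_TERM_LEN = 50  # assumed column width, for illustration only


def store_key(term, max_len=MAX_TERM_LEN):
    """Mimic a fixed-width column that silently truncates long values."""
    return term[:max_len]


class ToyIndex(object):
    """Hypothetical stand-in for a word index table with a unique key."""

    def __init__(self):
        self.rows = {}  # stored (truncated) term -> set of record ids

    def add_incremental(self, term, recid):
        # Buggy path: the lookup uses the raw term, while the stored
        # key was truncated, so an existing row can be missed.
        if term in self.rows:
            self.rows[term].add(recid)
            return
        key = store_key(term)
        if key in self.rows:
            raise KeyError("Duplicate entry %r" % key)
        self.rows[key] = set([recid])

    def add_washed(self, term, recid):
        # Safer path: truncate first, then look up and store the same key.
        key = store_key(term)
        self.rows.setdefault(key, set()).add(recid)
```

With a 60-character term, the first add_incremental() call succeeds and
the second one raises the duplicate error, whereas add_washed() simply
merges both record ids into one row.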

We believe we have fixed this problem in the latest git master branch,
but these fixes concern the Invenio v1.0 release series only.  If you
are on the v0.99 release series, then some back-porting may be needed.
Do you get these troubles on RERO DOC running Invenio v0.99.1?

In any case, rebuilding all your indexes from scratch (via bibindex -R)
should fix the problem for some time to come, even without patching your
sources, because I think you see this problem only with incremental
indexing; it should not happen during full re-indexing.  Is that right?

Best regards
-- 
Tibor Simko
