On Mon, Dec 11, 2017 at 10:59 PM, Adrien Grand <jpou...@gmail.com> wrote:
> This means the FST builder is fed twice with the same key, so it tries to
> merge their outputs. This should not happen since the terms dictionary
> deduplicates terms.
>
> Do you get additional errors if you enable assertions? What are the codec
> readers that you pass to addIndexes? Could they contain duplicate terms?

This hint is a good lead. I'll start by checking all our own reader implementations to see whether any of them could return the same term more than once. The index I have been given might be broken too, but we're also migrating the data by creating "fake" codec readers for things like postings, so the problem could be literally anywhere at this point.

I still can't tell which field caused it, because the most suspicious field didn't trigger the error when migrated by itself, and my overnight attempt died for other reasons. :)

TX
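
A rough sketch of the kind of duplicate-term check described above, written as standalone Java (9+) so it runs without Lucene on the classpath. The class `TermOrderCheck` and its methods `firstOutOfOrder` and `utf8` are hypothetical names, and the `Iterator<byte[]>` is only a stand-in for whatever a custom reader's terms enumeration yields. It assumes terms are expected in strictly increasing unsigned byte order, so an equal neighbour counts as a duplicate.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: checks that a stream of term bytes
// is strictly increasing, which also rules out duplicate terms.
final class TermOrderCheck {

    /**
     * Returns the zero-based position of the first term that is not strictly
     * greater than its predecessor (unsigned byte order), or -1 if every term
     * is in order and unique.
     */
    static int firstOutOfOrder(Iterator<byte[]> terms) {
        byte[] previous = null;
        int position = 0;
        while (terms.hasNext()) {
            byte[] current = terms.next();
            // compareUnsigned >= 0 means "duplicate" (== 0) or "went backwards" (> 0).
            if (previous != null && Arrays.compareUnsigned(previous, current) >= 0) {
                return position;
            }
            // Copy defensively in case the source reuses its buffer between calls.
            previous = current.clone();
            position++;
        }
        return -1;
    }

    /** Convenience for building sample terms from strings. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<byte[]> unique = List.of(utf8("apple"), utf8("banana"), utf8("cherry"));
        List<byte[]> repeated = List.of(utf8("apple"), utf8("banana"), utf8("banana"));
        System.out.println(firstOutOfOrder(unique.iterator()));   // prints -1
        System.out.println(firstOutOfOrder(repeated.iterator())); // prints 2
    }
}
```

Running the same comparison per field over each reader before passing it to addIndexes would at least narrow down which field and which reader hands back a repeated term.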