If you want to stick with the approach of multiple indexes, you'll have
to add some logic to work around it.
Option 1: Post-merge, loop through all the documents, identify
duplicates, and delete the one(s) you don't want.
Option 2: Pre-merge, read all the indexes in parallel, identifying and
deleting duplicates as above.
Option 3: When creating a new index, check the existing index first and
either delete the matches there or skip indexing the file, whichever
makes sense.
I'm sure there are other options as well, but no instant solutions.
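Option 3 could be sketched roughly as below. This is only an illustration, assuming a Lucene 4.x-style API (current at the time of this thread) and the "fileName" field from the original post as the unique key; mergeWithoutDuplicates is a hypothetical helper name.

```java
import java.io.IOException;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class DedupMerge {
    // Sketch: before merging, delete from the main index every document
    // whose fileName also appears in the new index, so addIndexes cannot
    // introduce duplicates.
    public static void mergeWithoutDuplicates(IndexWriter mainWriter,
                                              Directory newIndexDir) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(newIndexDir)) {
            // Enumerate all fileName terms present in the new index.
            Terms terms = MultiFields.getTerms(reader, "fileName");
            if (terms != null) {
                TermsEnum te = terms.iterator(null);
                BytesRef name;
                while ((name = te.next()) != null) {
                    // Delete any matching document(s) from the main index.
                    mainWriter.deleteDocuments(
                        new Term("fileName", BytesRef.deepCopyOf(name)));
                }
            }
        }
        // The merge now only adds documents that were not already present.
        mainWriter.addIndexes(newIndexDir);
        mainWriter.commit();
    }
}
```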
One obvious option is to skip the merging altogether: if you want one
big index, why not just work directly with that, using updateDocument
with the filename as the Term?
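That single-index approach might look like the sketch below. Again this is only illustrative, assuming a Lucene 4.x-style API; the "contents" field and the readContents helper are hypothetical, and "fileName" is assumed to uniquely identify a file as described in the original post.

```java
// Sketch: index straight into the one big index; updateDocument
// atomically deletes any existing document(s) matching the Term and
// then adds the new document, so no duplicates can accumulate and no
// separate merge step is needed.
Document doc = new Document();
// StringField: indexed as a single un-analyzed token, so the exact
// filename can be used as the update key.
doc.add(new StringField("fileName", fileName, Field.Store.YES));
doc.add(new TextField("contents", readContents(file), Field.Store.NO)); // readContents is hypothetical

writer.updateDocument(new Term("fileName", fileName), doc);
writer.commit();
```

Note that updateDocument works as a delete-then-add: on the very first indexing of a file the delete simply matches nothing, so the same call handles both new and re-indexed files.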
--
Ian.
On Wed, Sep 11, 2013 at 1:40 PM, Ankit Murarka
ankit.mura...@rancoretech.com wrote:
Hello,
I have a peculiar problem to deal with, and I am sure there must be
some way to handle it.
1. Indexes already exist on the server for the existing files.
2. Index generation is automated, so newly generated files also trigger
index generation.
3. I am merging the newly generated indexes with the existing index.
The field of prime importance is fileName.
Now, since the merge is done with writer.addIndexes(Directory), a file
that is indexed again is added to the index twice, so a search returns
more than one hit for the same file. There is no problem with the hit
itself; the problem is the same file being indexed twice during the
merge.
I need to ensure that when I merge indexes, if a term, say File1, is
already present, the existing entry is updated instead of a new one
being added. This is supposed to happen during the indexing process.
Kindly guide me as to how this can be achieved; the Javadoc does not
seem to help me.
TIA.
--
Regards
Ankit Murarka
What lies behind us and what lies before us are tiny matters compared with
what lies within us
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org