Note: I have changed the title of this thread to match its content

I am currently facing a similar issue. I am dealing with a large index
that is constantly used and needs to be updated on a daily basis. For
fear of corruption I would rather rebuild the index each time,
performing tests against it before using it. However, the problem I am
having is switching out the old index without causing service
interruption. As long as queries are being made against the index I
run into locking issues with the index files, which prevent me from
putting the new index in place. Any suggestions?

Thanks, 
Scott 
-----Original Message-----
From: Erick Erickson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 20, 2006 7:59 AM
To: java-user@lucene.apache.org
Subject: Re: MultiFieldQueryParser doesn't properly filter out documents
when the query string specifies to exclude certain terms

My first question is how many documents would you be deleting on a pass for
option 2? If it's 10 documents out of 10,000, I'd consider just deleting them
and re-adding (see IndexModifier).

Personally, if possible, I prefer your first option, building a completely
new index and switching between them. This is especially useful if something
catastrophic happens to the index as you build it and it winds up being
unusable (power failures *do* happen). You can keep using your old index and
be happy.
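
A minimal sketch of the build-then-switch idea, using only the Java standard
library and no Lucene calls. The class name IndexDirectories, the "index-"
directory prefix, and the "current" pointer file are all hypothetical names
chosen for illustration. The assumption is that every rebuild goes into its
own fresh directory, so the files of the live index are never overwritten
while searchers still hold them open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: one directory per rebuild, plus a pointer file naming the live one. */
final class IndexDirectories {
    private final Path root;
    private final Path pointer;

    IndexDirectories(Path root) {
        this.root = root;
        this.pointer = root.resolve("current"); // assumed pointer file name
    }

    /** Creates a fresh, empty directory for the next build; the live index is untouched. */
    Path newBuildDirectory() {
        try {
            Files.createDirectories(root);
            return Files.createTempDirectory(root, "index-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Marks a built (and tested) directory as live by rewriting the pointer file. */
    void publish(Path builtIndex) {
        try {
            Path tmp = root.resolve("current.tmp");
            byte[] name = builtIndex.getFileName().toString().getBytes(StandardCharsets.UTF_8);
            Files.write(tmp, name);
            // Replaces only the small pointer file, never the index files themselves.
            Files.move(tmp, pointer, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the live index directory, or null if nothing has been published yet. */
    Path current() {
        if (!Files.exists(pointer)) {
            return null;
        }
        try {
            String name = new String(Files.readAllBytes(pointer), StandardCharsets.UTF_8).trim();
            return root.resolve(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The old directory is left in place after publish(), so it can serve as the
fallback if the new build turns out to be unusable. Deleting it is left to
the caller, once no searcher has it open any more.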

Another question: how quickly does the index build, and how soon do your
users need up-to-date data?

And remember that no matter what, you must re-open your searcher to see the
updates.
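
One way to picture the re-open step, as a sketch only: queries fetch the
current searcher from a shared holder, and the update job swaps in a freshly
opened one. SearcherHolder is a hypothetical class, and the type parameter S
stands in for whatever searcher type the application uses; nothing here is a
Lucene API.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the searcher that queries should currently use. */
final class SearcherHolder<S extends Closeable> {
    private final AtomicReference<S> live = new AtomicReference<>();

    /** Returns the searcher to use for a new query, or null before the first swap. */
    S get() {
        return live.get();
    }

    /**
     * Installs a newly opened searcher and returns the previous one (or null).
     * The previous searcher is NOT closed here: the caller decides when it is
     * safe, since queries that started earlier may still be using it.
     */
    S swap(S reopened) {
        return live.getAndSet(reopened);
    }
}
```

New queries see the replacement as soon as swap() returns. When to close the
searcher it hands back depends on how the application tracks in-flight
queries, which this sketch leaves open.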

I'd be really reluctant to remove all the items and re-build the index for
several reasons...
1> You wouldn't get the new data being added until you closed/reopened your
searcher.
2> The documents you deleted wouldn't be "gone" until you closed/reopened
your searcher.
3> In the interim, your users wouldn't have access to much of anything....

Best
Erick

On 12/20/06, Adam Fleming <[EMAIL PROTECTED]> wrote:
>
>
> Hello Gentlemen (+Ladies?),
>
> I'm integrating Lucene into a Spring web-app, and have found a plethora of
> great web + print resources to make the integration quick and seamless.  One
> thing that I have been hard-pressed to find is a good solution for
> rebuilding the index on a regular basis.
>
> I'm curious if you know of a best-practice (or have found something
> personally that works) for rebuilding a Lucene Index w/o service
> interruptions.  The assumptions are a Spring IoC container w/ an
> IndexFactory bean.  I have the project configured to work with both
> FSDirectory and RAMDirectory implementations.   If you don't know Spring,
> you are free to ignore the details - I'll adapt your comments to my code :)
>
> So far I tried rebuilding the index on a regular schedule, but foolishly
> only added duplicate documents to an existing index.
>
> Things I have considered are
> - Using two index directories, and rebuilding one while the other is
>    in use + switching when the rebuilt index is ready.  This would
>    cause the app to alternate between two indexes.
> - Using a single index, and iterating over the index entirely,
>    deleting documents 1 by 1 and re-adding them with fresh data
> - Using a single index, and deleting ALL the documents at once
>    and then adding them all back as quickly as possible.
>
>
> All of my proposed ideas seem to fly in the face of Lucene's simplicity, and
> I will be so thankful to be pointed in the right direction.
>
>
> Happy Holidays and  a big Thank You to the active list users,
>
>
> Adam Fleming
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
