Re: How do we limit the growth of a Lucene Index?

Grant Ingersoll Mon, 05 Nov 2007 04:13:57 -0800

You could search this list about distributing your indexes, etc. RemoteSearchable may be handy, but you will probably have to build some infrastructure around it for handling failover, etc. (would make for a nice contribution).

How often do you think archived data will need to be accessed? And how much data are you talking? Seems to me like the main issue will be in managing the searchers in light of having a lot of potential indexes. Just thinking out loud, though.
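
To make the "managing the searchers" point a bit more concrete, here is a minimal sketch of one possible approach: keep only a bounded number of searchers open and close the least recently used one when the limit is exceeded. It is written in Python for brevity and does not call the Lucene API. `SearcherCache`, `open_fn`, and `close_fn` are hypothetical names; in a real setup `open_fn` would wrap whatever opens a searcher over an index directory, and `close_fn` whatever releases it.

```python
from collections import OrderedDict


class SearcherCache:
    """Keeps at most `capacity` searchers open, evicting the least recently used."""

    def __init__(self, capacity, open_fn, close_fn):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._open = open_fn    # hypothetical: opens a searcher for an index path
        self._close = close_fn  # hypothetical: releases a searcher's resources
        self._searchers = OrderedDict()  # index path -> open searcher

    def get(self, index_path):
        """Return the searcher for `index_path`, opening it on first use."""
        if index_path in self._searchers:
            # Mark as most recently used.
            self._searchers.move_to_end(index_path)
            return self._searchers[index_path]
        searcher = self._open(index_path)
        self._searchers[index_path] = searcher
        if len(self._searchers) > self._capacity:
            # Drop the entry that has gone unused the longest.
            _, evicted = self._searchers.popitem(last=False)
            self._close(evicted)
        return searcher

    def open_paths(self):
        """Index paths with an open searcher, least recently used first."""
        return list(self._searchers)
```

With this shape, rarely used archived indexes cost nothing while idle, at the price of a reopen delay the next time someone searches them. The capacity would have to be tuned to how often archived data is actually accessed.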


-Grant

On Nov 4, 2007, at 1:48 PM, Sandeep Mahendru wrote:

Hi,
We have been developing an enterprise logging service at Wachovia
bank. The logs (business, application, error) for all the bank-related
applications are consolidated
at one single location in an Oracle 10g database.
In our second phase, we are now building a high-performing report viewer
over it. So our search algorithm does not go to the Oracle 10g DB. We
therefore avoid network and I/O.
Our search algorithm now goes to a Lucene index. We have Lucene indexes
created for each application. These indexes are present on the same
machine where the search algorithm runs. As more applications at the bank
are now beginning to consume this service, the Lucene index is growing.
One of my team leads has suggested the following approach to resolve this
issue:
*I think the best approach to restrict the index size is to keep it for
some limited time and then archive it. In case a user wants to search
against the old files, we might need to provide some configuration with
which the Lucene searcher can point to the archived file and search the
content. To implement this, we need to rename the index file with from and
to dates before it is archived. While searching against the older files,
the user needs to provide the date range, and then the app can point to
the relevant archived index files for search. Let me know your thoughts
on this.*
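
The date-range lookup described in this proposal can be sketched roughly as follows. This is only an illustration in Python, not Lucene code: the naming convention `<app>_<YYYYMMDD>_<YYYYMMDD>` and the names `ArchivedIndex`, `parse_archive_name`, and `select_archives` are all assumptions made for the example. Given the archived index names and the user's date range, it returns the archives whose range overlaps the request.

```python
from datetime import date, datetime
from typing import List, NamedTuple, Optional


class ArchivedIndex(NamedTuple):
    name: str
    start: date
    end: date


def parse_archive_name(name: str) -> Optional[ArchivedIndex]:
    """Parse '<app>_<YYYYMMDD>_<YYYYMMDD>' (assumed convention); None if it does not match."""
    parts = name.rsplit("_", 2)
    if len(parts) != 3:
        return None
    try:
        start = datetime.strptime(parts[1], "%Y%m%d").date()
        end = datetime.strptime(parts[2], "%Y%m%d").date()
    except ValueError:
        return None
    if start > end:
        return None
    return ArchivedIndex(name, start, end)


def select_archives(names: List[str], query_from: date, query_to: date) -> List[str]:
    """Return archive names whose date range overlaps [query_from, query_to], oldest first."""
    archives = [a for a in map(parse_archive_name, names) if a is not None]
    # Two closed ranges overlap when each one starts before the other ends.
    hits = [a for a in archives if a.start <= query_to and a.end >= query_from]
    return [a.name for a in sorted(hits, key=lambda a: a.start)]
```

For example, with quarterly archives named `payments_20070101_20070331` and `payments_20070401_20070630`, a query for 15 March to 10 April 2007 would select both, and the application would then open searchers over just those two indexes.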
At present this sounds the most logical to me. But then we begin to store
the Lucene indexes on a different machine. This might again cause the
search algorithm to make a network trip if the search is based on old
archived data.
Is there a better design to resolve the above concern? Does Lucene provide
some sort of API to handle the above scenarios?

Regards,
Sandeep.


--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Boot Camp Training:
ApacheCon Atlanta, Nov. 12, 2007.  Sign up now!  http://www.apachecon.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


