Re: Index size and performance degradation

Itamar Syn-Hershko Sun, 12 Jun 2011 11:52:02 -0700

Our problem is a bit different. There aren't always common searches soif we cache blindly we could end up having too much RAM allocated forvirtually nothing. And we need to allow for real-time search so cachingwill hardly help. We enforce some client-side caching, but again - thereal-time requirement is a bit of a problem...

I'm not sure I understood the filters approach you described. Can yougive an example?

So we'll have to sit and plan some strategies. Perhaps a Lucene API toallow prioritizing caching based on fields is going to prove useful insuch scenarios - so some application logic could make the search coreitself a bit more robust based on usage (or have Lucene learn on its own).

BTW, I don't actually recall reading about this topic anywhere, sothanks again for all the good advice and perhaps it worthy adding to LIA3 :)



Itamar.


On 12/06/2011 20:29, Shai Erera wrote:

Shai, what would you call a smart app-level cache? remembering frequent
searches and storing them handy?


Remembering frequent searches is good. If you do this, you can warm up the
cache whenever a new IndexSearcher is opened (e.g., if you use
SearcherManager from LIA2) and besides keeping the results 'ready', you also
keep them up-to-date.

Another thing to consider is session-level cache. This can have several
uses:
(1) In non-AJAX apps (probably not too many today, hopefully?), a page
reload can issue several queries to the backend, while usually only one
portion of the page gets updated, so caching the queries the user submitted
during his session will help here.
(2) If users interact w/ your web app by, e.g., repeating the same actions,
that will help. An example is a user who frequently clicks the "Back" and
"Forward" buttons in the browser. Although, there are client-side solutions
for that, using Dojo, which helps you store the 'previous' pages the user
visited.
(3) If your app is ACL-constrained, caching the user ACLs and their matching
docs will be very useful.

In another app I'm involved with there are several Filters that exist. Their
number varies between deployments, but for each they are fixed and known in
advance. Each Filter matches a different set of documents and queries are
always added one or more Filters. So we cache them (and warm them up when
opening new IndexSearchers) using CachingWrapperFilter, which is great since
it works at the segment level, so warming up is very fast usually.

Another cache, which is very high-level are pre-defined queries with their
matching result set. I.e., for queries like "my-company-name" you always
return a fixed N results, and only if the user pages through them, do you
run the actual query.

And the list can go on and on :).

Shai


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Index size and performance degradation

Reply via email to