Thanks for the quick reply :). For now, I'd settle with just storing cache values in soft references so at least the GC would be able to free up some space when it needs to.
I think I'll just try to override the default sorting mechanism by subclassing FieldSortedHitQueue. I'll let you know how it goes. If anyone else has any idea please do let me know. Regards. Pablo. markrmiller wrote: > > 20 fields on a huge index? Wow - not sure there is a ton you can do with > that...anyone have any suggestions for that one? Distributed should help > I suppose, but thats a lot of sort fields for a large index. > > If LUCENE-831 ever gets off the ground you will be able to change the > cache used, and possibly use something that spills over to disk. > > PabloS wrote: >> Hi, >> >> I'm having a similar problem with my application, although we are using >> lucene 2.3.2. The problem we have is that we are required to sort on most >> of >> the fields (20 at least). Is there any way of changing the cache being >> used? >> I can't seem to find a way, since the cache is being accessed using the >> FieldCache.DEFAULT static field.. >> >> Any tip would be appreciated, otherwise I'll have to start looking for a >> clustered solution like Todd. >> >> Thanks in advance. >> Pablo >> >> >> >> >> markrmiller wrote: >> >>> The term, terminfo, indexreader internals stuff is prob on the low end >>> compared to the size of your field caches (needed for sorting). If you >>> are sorting by String I think the space needed is 32 bits x number of >>> docs + an array to hold all of the unique terms. So checking 300 million >>> docs (I know you are actually breaking it up smaller than that, but for >>> example) and ignoring things like String chars being variable byte >>> lengths and storing the length, etc and randomly picking 50000 unique >>> terms at 6 chars per: >>> >>> 32 bits x 300000000 + 50000 x 6 x 16 bits to MB = 1 144.98138 megabytes >>> >>> Thats per field your sorting on. If you are sorting on an int field it >>> should be closer to 32 bits x num docs - shorts, 32 bits x num docs, >>> etc. >>> >>> So you have those field caches, plus the IndexReader terminfo, term >>> stuff, plus whatever RAM your app needs beyond Lucene. 4 gig might just >>> not *quite* cut it is my guess. >>> >>> Todd Benge wrote: >>> >>>> There's usually only a couple sort fields and a bunch of terms in the >>>> various indices. The terms are user entered on various media so the >>>> number of terms is very large. >>>> >>>> Thanks for the help. >>>> >>>> Todd >>>> >>>> >>>> >>>> On 10/29/08, Todd Benge <[EMAIL PROTECTED]> wrote: >>>> >>>> >>>>> Hi, >>>>> >>>>> I'm the lead engineer for search on a large website using lucene for >>>>> search. >>>>> >>>>> We're indexing about 300M documents in ~ 100 indices. The indices add >>>>> up to ~ 60G. >>>>> >>>>> The indices are sorted into 4 different Multisearcher with the largest >>>>> handling ~50G. >>>>> >>>>> The code is basically like the following: >>>>> >>>>> private static MultiSearcher searcher; >>>>> >>>>> public void init(File files) { >>>>> >>>>> IndexSearcer [] searchers = new IndexSearcher[files.length] (); >>>>> int i = 0; >>>>> for ( File file: files ) { >>>>> searchers[i++] = new >>>>> IndexSearcher(FSDirectory.getDirectory(file); >>>>> } >>>>> >>>>> searcher = new MultiSearcher(searchers); >>>>> } >>>>> >>>>> public Searcher getSearcher() { >>>>> return searcher; >>>>> } >>>>> >>>>> We're seeing a high cache rate with Term & TermInfo in Lucene 2.4. >>>>> Performance is good but servers are consistently hanging with >>>>> OutOfMemory errors. >>>>> >>>>> We're allocating 4G in the heap to each server. >>>>> >>>>> Is there any way to control the amount of memory Lucene consume for >>>>> caching? Any other suggestions on fixing the memory errors? >>>>> >>>>> Thanks, >>>>> >>>>> Todd >>>>> >>>>> >>>>> >>>> >>>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >>> >>> >> >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/OutOfMemory-Problems-Lucene-2.4---Tomcat-tp20236834p20264511.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]