large term vectors

2008-02-08 Thread marc.dumontier
Hi, I have a large index which is around 275GB. As I search different parts of the index, the memory footprint grows with large byte arrays being stored. They never seem to get unloaded or GC'ed. Is there any way to control this behavior so that I can periodically unload cached information?

Re: large term vectors

2008-02-10 Thread Cedric Ho
Is it a single index? My index is also in the 200G range, but I never managed to get a single index of size > 20G and still get acceptable performance (in both searching and updating). So I split my indexes into chunks of < 10G. I am curious as to how you manage such a single large index. Cedric

Re: large term vectors

2008-02-10 Thread Briggs
So, I have a question about 'splitting indexes'. I see people say this all over, but how have people been handling this? I'm going to start a new thread, and there probably was one back in the day, but I am going to fire it up again. But, how did you do it? On Feb 10, 2008 9:18 PM, Cedric Ho <

Re: large term vectors

2008-02-11 Thread Karl Wettin
ay, February 11, 2008 7:46 AM To: java-user@lucene.apache.org Subject: Re: large term vectors Hi Marc, Can you give more info about what your field properties are? Your subject line implies you are storing term vectors, is that the case? Also, what version of Lucene are you using? Cheers, Gran

RE: large term vectors

2008-02-11 Thread marc.dumontier
..is there some way to optimize around this? Marc -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: Monday, February 11, 2008 7:46 AM To: java-user@lucene.apache.org Subject: Re: large term vectors Hi Marc, Can you give more info about what your field proper

RE: large term vectors

2008-02-11 Thread marc.dumontier
ms to have something to do with the norms (SegmentReader.norms) Marc -Original Message- From: Cedric Ho [mailto:[EMAIL PROTECTED] Sent: Sunday, February 10, 2008 9:19 PM To: java-user@lucene.apache.org Subject: Re: large term vectors Is it a single index ? My index is also in the 200G
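
Editor's note on the norms remark above: in Lucene releases of that period, norms were held in memory as one byte per document for each indexed field that keeps norms, which is consistent with the large byte arrays described at the start of the thread. The following is only a back-of-the-envelope sketch of that arithmetic. The class and method names are hypothetical, and the document and field counts you pass in are your own assumptions, not figures from this thread.

```java
// Hypothetical helper, not part of Lucene: estimates heap used by loaded norms,
// assuming one byte per document per indexed field with norms enabled.
final class NormsMemoryEstimate {

    private NormsMemoryEstimate() {
        // static helper only
    }

    /**
     * @param maxDoc          number of document slots in the index (assumed input)
     * @param fieldsWithNorms number of indexed fields that keep norms (assumed input)
     * @return estimated bytes held in memory once every field's norms are loaded
     */
    static long normsBytes(long maxDoc, int fieldsWithNorms) {
        if (maxDoc < 0 || fieldsWithNorms < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        return maxDoc * fieldsWithNorms;
    }
}
```

Calling `NormsMemoryEstimate.normsBytes(maxDoc, fieldCount)` with your own numbers gives a rough figure to compare against the observed heap growth. If norms do turn out to be the cause, omitting them on fields that do not need length normalization (`Field.setOmitNorms(true)` in Lucene 2.x) is one commonly suggested way to reduce it.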

Re: large term vectors

2008-02-11 Thread Grant Ingersoll
Hi Marc, Can you give more info about what your field properties are? Your subject line implies you are storing term vectors, is that the case? Also, what version of Lucene are you using? Cheers, Grant On Feb 8, 2008, at 10:51 AM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED] > wrote: Hi,

Re: large term vectors

2008-02-10 Thread Cedric Ho
I guess it would be quite different for different apps. For me, I do index updates on a single machine: index each incoming document into one chunk according to some rule to ensure even distribution. Then copy all the updated indexes to some other machines for searching. Each machine will then reo
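
Editor's note: the message does not say which rule was used to assign documents to chunks. One plausible way to get an even spread is to hash a stable document key and take it modulo the number of chunks. The sketch below illustrates that idea only. `ChunkRouter`, `chunkFor`, and the use of CRC32 on a string document id are assumptions made for illustration, not the poster's actual method.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical router: maps a document key to one of N index chunks.
final class ChunkRouter {
    private final int chunkCount;

    ChunkRouter(int chunkCount) {
        if (chunkCount <= 0) {
            throw new IllegalArgumentException("chunkCount must be positive");
        }
        this.chunkCount = chunkCount;
    }

    /** Returns a chunk number in [0, chunkCount) that is stable for a given docId. */
    int chunkFor(String docId) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value stored in a long, so the remainder is non-negative.
        return (int) (crc.getValue() % chunkCount);
    }
}
```

With a router like this, the indexing machine would open one writer per chunk and send each incoming document to the writer selected by `chunkFor`. Because the mapping is deterministic, a later update or delete for the same document key lands in the same chunk.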