Re: Lucene coreClosedListeners memory issues

2019-06-04 Thread Adrien Grand
drien Grand > > <mailto:jpou...@gmail.com> wrote > > > > It looks like you are leaking readers. > > > > On Mon, Jun 3, 2019 at 9:46 AM alex stark > > <mailto:alex.st...@zoho.com.invalid> wrote: > > > > > >

Re: Lucene coreClosedListeners memory issues

2019-06-03 Thread alex stark
on, Jun 3, 2019 at 9:46 AM alex stark > <mailto:alex.st...@zoho.com.invalid> wrote: > > > > Hi experts, > > > > > > > > I recently have memory issues on Lucene. By checking heap dump, most of > > them are occupied by SegmentCoreReaders.coreC

Re: Lucene coreClosedListeners memory issues

2019-06-03 Thread Adrien Grand
wrote > > It looks like you are leaking readers. > > On Mon, Jun 3, 2019 at 9:46 AM alex stark wrote: > > > > Hi experts, > > > > > > > > I recently have memory issues on Lucene. By checking heap dump, most of > > them are occ

Re: Lucene coreClosedListeners memory issues

2019-06-03 Thread alex stark
e: > > Hi experts, > > > > I recently have memory issues on Lucene. By checking heap dump, most of them > are occupied by SegmentCoreReaders.coreClosedListeners which is about nearly > half of all. > > > > > > Dominator Tree==

Re: Lucene coreClosedListeners memory issues

2019-06-03 Thread Adrien Grand
It looks like you are leaking readers. On Mon, Jun 3, 2019 at 9:46 AM alex stark wrote: > > Hi experts, > > > > I recently have memory issues on Lucene. By checking heap dump, most of them > are occupied by SegmentCoreReaders.coreClosedListeners which is about

Lucene coreClosedListeners memory issues

2019-06-03 Thread alex stark
Hi experts, I recently have memory issues on Lucene. By checking heap dump, most of them are occupied by SegmentCoreReaders.coreClosedListeners which is about nearly half of all. Dominator Tree num retain size(bytes) percent percent(live) class Name

RE: Avoid memory issues when indexing terms with multiplicity

2014-04-07 Thread Dávid Nemeskey
Hi Uwe, thanks for your reply, too. :) I must admit that I was ahead of myself in the mail a bit, because I am not using a TokenFilter yet, but expand the tokens manually before sending them to Lucene. It is good to know that it makes a difference. I will definitely try the TokenStream-based solu

Re: Avoid memory issues when indexing terms with multiplicity

2014-04-07 Thread Dávid Nemeskey
Hi Greg, thanks for the reply. We used #1 before, but we want to get rid of positions in our index; they had a very noticeable effect on the performance. As for #2: I was looking for something like this, thanks! Now the only question is how do I do it. :) Can I specify what TermsConsumer to use, t

RE: Avoid memory issues when indexing terms with multiplicity

2014-04-04 Thread Uwe Schindler
Hi, > The use-case is that some of the fields in the document are made up of > term:frequency pairs. What I am doing right now is to expand these with a > TokenFilter, so that for e.g. "dog:3 cat:2", I return "dog dog dog cat cat", > and > index that. However, the problem is that when these field

Re: Avoid memory issues when indexing terms with multiplicity

2014-04-04 Thread Gregory Dearing
Hi David, I'm not an expert, but I've climbed through the consumers myself in the past. The big limit is that the full postings for a document or document block must fit into memory. There may be other hidden processing limits (i.e. memory used per field). I think it would be possible to create

Avoid memory issues when indexing terms with multiplicity

2014-04-04 Thread Dávid Nemeskey
Hi guys, I have just recently (re-)joined the list. I have an issue with indexing; I hope someone can help me with it. The use-case is that some of the fields in the document are made up of term:frequency pairs. What I am doing right now is to expand these with a TokenFilter, so that for e.g. "do
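The expansion described in this thread (turning "dog:3 cat:2" into "dog dog dog cat cat" before indexing) can be sketched in plain Java as below. This is only an illustration of the manual approach, not the poster's code and not a Lucene TokenFilter; the class name TermFrequencyExpander and the method expand are hypothetical. Note that the output grows with the sum of all frequencies, which is consistent with the memory concern the thread is about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: expands whitespace-separated
// term:frequency pairs by repeating each term "frequency" times.
public class TermFrequencyExpander {

    public static String expand(String field) {
        List<String> out = new ArrayList<>();
        for (String pair : field.trim().split("\\s+")) {
            if (pair.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            // Split on the last ':' so a term may itself contain ':'.
            int sep = pair.lastIndexOf(':');
            if (sep <= 0) {
                throw new IllegalArgumentException("Expected term:frequency, got: " + pair);
            }
            String term = pair.substring(0, sep);
            int freq = Integer.parseInt(pair.substring(sep + 1));
            if (freq < 0) {
                throw new IllegalArgumentException("Negative frequency in: " + pair);
            }
            // Memory cost is proportional to the total of all frequencies.
            for (int i = 0; i < freq; i++) {
                out.add(term);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(expand("dog:3 cat:2")); // dog dog dog cat cat
    }
}
```

Building the whole expanded string up front is the simplest option, but it is also the part that scales badly for large frequencies; the replies in this thread point toward producing the repeated tokens inside the analysis chain instead.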

Re: Memory issues with Lucene deployment

2012-09-27 Thread Paul Taylor
On 25/09/2012 20:09, Uwe Schindler wrote: Hi, Without a full output of "free -h" we cannot say anything. But the total Linux memory use should always be 100% on a good server, otherwise it's useless (because total memory use includes cache usage, too). I think -Xmx may be too low for your Jav

RE: Memory issues with Lucene deployment

2012-09-25 Thread Uwe Schindler
men http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Paul Taylor [mailto:paul_t...@fastmail.fm] > Sent: Tuesday, September 25, 2012 8:49 PM > To: java-user@lucene.apache.org >> "'java-user@lucene.apache.org'" > Subject: Memory i

Memory issues with Lucene deployment

2012-09-25 Thread Paul Taylor
Doing Lucene search within a jetty servlet container, the machine has 16gb of memory. Using 64bit JVM and Lucene 3.6 and files are memory mapped so I just allocate a max of 512mb to jetty itself, understanding that the remaining memory can be used to memory map lucene files. Monitoring total

Re: Memory issues

2011-09-05 Thread Toke Eskildsen
On Sat, 2011-09-03 at 20:09 +0200, Michael Bell wrote: > To be exact, there are about 300 million documents. This is running on a 64 > bit JVM/64 bit OS with 24 GB(!) RAM allocated. How much memory is allocated to the JVM? > Now, their searches are working fine IF you do not SORT the results. If

Re: Memory issues

2011-09-05 Thread Stefan Trcek
Michael Bell wrote: > How best to diagnose? > >> Call your java process this way >>java -XX:HeapDumpPath=. -XX:+HeapDumpOnOutOfMemoryError >> and drag'n'drop the resulting java_pid*.hprof into eclipse. >> You will get an outline by class for the number and size of allocated >> objects. Just lo

Re: Memory issues

2011-09-05 Thread Stefan Trcek
On Saturday 03 September 2011 20:09:54 Michael Bell wrote: > 2011-08-30 13:01:31,489 [TP-Processor8] ERROR > com.gwava.utils.ServerErrorHandlerStrategy - reportError: > nastybadthing :: > com.gwava.indexing.lucene.internal.LuceneSearchController.performSearchOperation:229 :: EXCEPTION : java.lang

RE: Memory issues

2011-09-03 Thread Uwe Schindler
-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael Bell [mailto:m...@gwava.com] > Sent: Saturday, September 03, 2011 8:10 PM > To: java-user@lucene.apache.org > Subject: Memory issues > > Ok, one customer of ours insists on

Memory issues

2011-09-03 Thread Michael Bell
Ok, one customer of ours insists on running a Really Big Single Server Lucene index. To be exact, there are about 300 million documents. This is running on a 64 bit JVM/64 bit OS with 24 GB(!) RAM allocated. Until very recently all was well. We then updated them in program version, which updat

Re: heap memory issues when sorting by a string field

2009-12-17 Thread Michael McCandless
I think this'd make a nice contribution -- eg it could be bundled up as a FieldComparator impl, eg LowMemoryStringComparator, that would compute the global ords in multiple passes with limited RAM usage. It'd give users the space/time tradeoff... Mike On Mon, Dec 14, 2009 at 9:09 AM, Toke Eskild

Re: heap memory issues when sorting by a string field

2009-12-14 Thread Toke Eskildsen
On Fri, 2009-12-11 at 14:53 +0100, Michael McCandless wrote: > How long does Lucene take to build the ords for the toplevel reader? > > You should be able to just time FieldCache.getStringIndex(topLevelReader). > > I think your 8.5 seconds for first Lucene search was with the > StringIndex compute

Re: heap memory issues when sorting by a string field

2009-12-11 Thread Michael McCandless
. The order-array is updated for the documents that >> > has >> > one of these terms. The sliding is repeated multiple times, where terms >> > ordered >> > before the last term of the previous iteration are ignored. >> > >> > Cons: _Very_ slow (too slow in the current implementation) order bu

Re: heap memory issues when sorting by a string field

2009-12-11 Thread Toke Eskildsen
t; > > > Cons: _Very_ slow (too slow in the current implementation) order build. > > Pros: Same as above. > > Joker: The buffer size determines memory use vs. order build time. > > > > > > The multipass approach looks promising, but requires more work to get

Re: heap memory issues when sorting by a string field

2009-12-10 Thread Toke Eskildsen
> > Joker: The buffer size determines memory use vs. order build time. > > > > > > The multipass approach looks promising, but requires more work to get to a > > usable state. Right now it takes minutes to build the order-array for half a > > million documents, with a buffer size req

Re: heap memory issues when sorting by a string field

2009-12-10 Thread Michael McCandless
On Thu, Dec 10, 2009 at 2:05 AM, Ganesh wrote: > I think, This problem will happen for all sorted fields. I am sorting on > integer field. Integer field should take much less RAM than String, today, for sorting. And there's no efficiency gained by doing this globally (per segment is just fine).

Re: heap memory issues when sorting by a string field

2009-12-10 Thread Michael McCandless
r-array for half a > million documents, with a buffer size requiring 5 iterations. If I ever get > it to > work, I'll be sure to share it. > > Regards, > Toke Eskildsen > > > From: TCK [moonwatcher32...@gmail.com] > Sent: 09 December 20

Re: heap memory issues when sorting by a string field

2009-12-09 Thread Ganesh
e to share it. Regards, Toke Eskildsen From: TCK [moonwatcher32...@gmail.com] Sent: 09 December 2009 22:58 To: java-user@lucene.apache.org Subject: Re: heap memory issues when sorting by a string field Thanks Mike for opening this jira ticket and for your p

RE: heap memory issues when sorting by a string field

2009-12-09 Thread Toke Eskildsen
rds, Toke Eskildsen From: TCK [moonwatcher32...@gmail.com] Sent: 09 December 2009 22:58 To: java-user@lucene.apache.org Subject: Re: heap memory issues when sorting by a string field Thanks Mike for opening this jira ticket and for your patch. Explicitly removing the entry from the

Re: heap memory issues when sorting by a string field

2009-12-09 Thread Michael McCandless
>> something) >> >>> ? >> >>> >> >>> Thanks again, >> >>> TCK >> >>> >> >>> >> >>> >> >>> >> >>> On Mon, Dec 7, 2009 at 4:37 PM, Erick Erickson < >> erickerick...

Re: heap memory issues when sorting by a string field

2009-12-09 Thread TCK
're not really closing your > >>> > readers even though you think you are. Sorting indeed uses up > >>> > significant memory when it populates internal caches and keeps > >>> > it around for later use (which is one of the reasons that warming > >>>

Re: heap memory issues when sorting by a string field

2009-12-08 Thread Michael McCandless
se (which is one of the reasons that warming >>> > queries matter). But if you really do close the reader, I'm pretty >>> > sure the memory should be GC-able. >>> > >>> > One thing that trips people up is IndexReader.reopen(). If it >>>

Re: heap memory issues when sorting by a string field

2009-12-08 Thread Michael McCandless
't be returned. An example from the Javadocs... >> > >> >  IndexReader reader = ... >> >  ... >> >  IndexReader new = r.reopen(); >> >  if (new != reader) { >> >   ...     // reader was reopened >> >   reader.close(); >> >  } >>

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Jason Rutherglen
>> > >> TCK >> > >> >> > >> >> > >> >> > >> On Mon, Dec 7, 2009 at 4:37 PM, Erick Erickson < >> erickerick...@gmail.com >> > >> >wrote: >> > >> >> > >

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Jason Rutherglen
ps >> >> > it around for later use (which is one of the reasons that warming >> >> > queries matter). But if you really do close the reader, I'm pretty >> >> > sure the memory should be GC-able. >> >> > >> >> > One thing that trips people u

Re: heap memory issues when sorting by a string field

2009-12-07 Thread TCK
that you're not really closing your > > >> > readers even though you think you are. Sorting indeed uses up > > >> > significant memory when it populates internal caches and keeps > > >> > it around for later use (which is one of the reasons th

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Tom Hill
ons that warming > >> > queries matter). But if you really do close the reader, I'm pretty > >> > sure the memory should be GC-able. > >> > > >> > One thing that trips people up is IndexReader.reopen(). If it > >> > returns a reader

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Jason Rutherglen
Reader.reopen(). If it >> > returns a reader different than the original, you *must* close the >> > old one. If you don't, the old reader is still hanging around and >> > memory won't be returned. An example from the Javadocs... >> > >> >  IndexReader reade

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Tom Hill
> ... > > IndexReader new = r.reopen(); > > if (new != reader) { > > ... // reader was reopened > > reader.close(); > > } > > reader = new; > > ... > > > > > > If this is irrelevant, could you post your close/open

Re: heap memory issues when sorting by a string field

2009-12-07 Thread TCK
> IndexReader new = r.reopen(); > if (new != reader) { > ... // reader was reopened > reader.close(); > } > reader = new; > ... > > > If this is irrelevant, could you post your close/open > > code? > > HTH > > Erick > > > On Mon,

Re: heap memory issues when sorting by a string field

2009-12-07 Thread Erick Erickson
M, TCK wrote: > Hi, > I'm having heap memory issues when I do lucene queries involving sorting by > a string field. Such queries seem to load a lot of data in to the heap. > Moreover lucene seems to hold on to references to this data even after the > index reader has been closed

heap memory issues when sorting by a string field

2009-12-07 Thread TCK
Hi, I'm having heap memory issues when I do lucene queries involving sorting by a string field. Such queries seem to load a lot of data in to the heap. Moreover lucene seems to hold on to references to this data even after the index reader has been closed and a full GC has been run. Some o
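The advice repeated through this thread is that when reopen() returns a reader different from the original, the old reader must be closed, otherwise it stays reachable and its cached data is never released. The sketch below illustrates only that swap-and-close pattern. FakeReader, its integer "version" argument, and ReaderSwapSketch are hypothetical stand-ins written for this example; they are not Lucene's IndexReader API. The variable is named newReader because new, as written in the quoted snippet, is a reserved word in Java and would not compile.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderSwapSketch {

    // Hypothetical stand-in for an index reader; NOT Lucene's IndexReader.
    public static final class FakeReader {
        static final AtomicInteger OPEN = new AtomicInteger();
        private final int version;
        private boolean closed;

        FakeReader(int version) {
            this.version = version;
            OPEN.incrementAndGet();
        }

        // Returns this same reader if the index is unchanged, else a new one.
        public FakeReader reopen(int currentIndexVersion) {
            return currentIndexVersion == version ? this : new FakeReader(currentIndexVersion);
        }

        public void close() {
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    private FakeReader reader = new FakeReader(0);

    // Swap in the reopened reader and close the old one only if it was replaced.
    public void refresh(int currentIndexVersion) {
        FakeReader newReader = reader.reopen(currentIndexVersion);
        if (newReader != reader) {
            reader.close(); // skipping this leaks one reader per refresh
        }
        reader = newReader;
    }

    // Number of stand-in readers currently open, for leak checking.
    public static int openReaders() {
        return FakeReader.OPEN.get();
    }
}
```

With the close() call in place, the count of open readers stays constant however many times refresh() runs; removing it makes the count grow by one for every change to the index, which is the kind of leak the replies suspect.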

Memory Issues

2006-01-17 Thread Rob Young
Hi, I've developed a service which accepts search requests over the network, runs them with Lucene and pumps out results. I have noticed that if I use RAMDirectory the memory usage is much more (more than expected) and it grows as the service is left running. The lucene index is 34Mb but when