Re[2]: Out of memory exception for big indexes

2007-04-25 Thread Artem
Hello Ivan, That was cool news! Thanks! :) The timings are surprisingly good. 10 million docs sorted in 20 s, cool! It also looks like the sorting algorithm employed by Lucene is quite memory-efficient. Not supporting multiple fields is in fact another limitation of my patch. I don't need it so I didn't i

Re: Out of memory exception for big indexes

2007-04-25 Thread Ivan Vasilev
Hi Artem, Thank you very much for your mails :) First I have to tell you that your patch works perfectly even with very big indexes - 40 GB (you can see the results below). The reason I had bad test results last time is that I made a small change (but I cannot understand why this change

Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
Hi Ivan! By the way, maybe forbidding the sorted search when there are too many results is an option? I did it this way in my case. Regards, Artem. On 4/24/07, Artem Vasiliev <[EMAIL PROTECTED]> wrote: Ahhh, you said in your original post that your search matches _all_ the results.. Yup, my patch will not h

Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
Ahhh, you said in your original post that your search matches _all_ the results.. Yup, my patch will not help much in this case - after all, all the values have to be read to be compared while sorting! :) The LUCENE-769 patch helps only if the result set is significantly smaller than the full index. Regards
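
A minimal sketch of the point made in the message above, that sorting the hits means reading the sort value of every matching document, even when only the first few results are fetched. This is a generic bounded-heap top-N selection in plain Java, not Lucene's code and not the LUCENE-769 patch; the class name, method name and the idea of passing the hits' sort values as an int array are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: generic top-N selection, not Lucene's implementation. */
public class TopNSketch {

    /** Returns the n smallest sort values among the hits, in ascending order. */
    public static List<Integer> topN(int[] sortValuesOfHits, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap that never holds more than n values: memory is O(n), not O(hits).
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (int value : sortValuesOfHits) { // every hit's value is still read once
            if (heap.size() < n) {
                heap.offer(value);
            } else if (value < heap.peek()) {
                heap.poll();       // drop the current largest of the kept values
                heap.offer(value); // keep the smaller newcomer instead
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }
}
```

The queue stays small, but the loop still visits one value per hit, so when a query matches the whole index the cost of obtaining those values dominates.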

Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
Hello Ivan! It's so sad to me that you had bad results with that patch. :) The discussion in the ticket is out of date - the patch was initially spread over several classes and used WeakHashMap, but it then evolved into what it is now - one StoredFieldSortFactory class. I use it in my sharehound app in pretty m

Re: Out of memory exception for big indexes

2007-04-23 Thread Ivan Vasilev
Hi All, THANK YOU FOR YOUR HELP :) I posted this problem to the forum but unfortunately I had no chance to work on it last week... So now I have tested Artem's patch, but the results show: 1) speed is very slow compared with usage without the patch 2) There are no very big differences in memory usage

Re: Out of memory exception for big indexes

2007-04-17 Thread Craig W Conway
, 2007 11:00:53 AM Subject: Re: Out of memory exception for big indexes : Would it be fair to say that you can expect OutOfMemory errors if you : run complex queries? ie sorts, boosts, weights... not intrinsically ... the amount of memory used has more to do with the size of the index and the sortin

Re[4]: Out of memory exception for big indexes

2007-04-09 Thread Artem
Hello Nilesh, Sunday, April 8, 2007, 10:58:32 PM, you wrote: [talkin' about LUCENE-769] >> I must note that my patch only helps in lucene-OOM situations related to >> _sorted_ queries. If this is your case then I think yes it will help. NB> Probably a newbie question, but can you please explain

Fwd: Re[2]: Out of memory exception for big indexes

2007-04-09 Thread Artem
Hello Nilesh and all! NB> This seems like a very useful patch. Our application searches over 50 NB> million docs in a 40 GB index. We only have simple conjunctive queries NB> on a single field. Currently, the command line search program that NB> prints top-10 results requires at least 200 MB memory.

Re: Re[2]: Out of memory exception for big indexes

2007-04-08 Thread Erick Erickson
It *is* a bit confusing, since every search is sorted, kinda. Practically, a sorted query is one where you call one of the search methods (on, say, Searcher) with a Sort object, which sorts on one or more of the fields in your index (which ones are used are specified in the (array of) Sort obj

Re: Re[2]: Out of memory exception for big indexes

2007-04-08 Thread Nilesh Bansal
On 4/8/07, Artem <[EMAIL PROTECTED]> wrote: I must note that my patch only helps in lucene-OOM situations related to _sorted_ queries. If this is your case then I think yes it will help. Probably a newbie question, but can you please explain what sorted queries mean? Is simple keyword search a s

Re[2]: Out of memory exception for big indexes

2007-04-08 Thread Artem
Hello Nilesh, Sunday, April 8, 2007, 9:03:06 AM, you wrote: NB> This seems like a very useful patch. Our application searches over 50 NB> million docs in a 40 GB index. We only have simple conjunctive queries NB> on a single field. Currently, the command line search program that NB> prints top-10 r

Re: Out of memory exception for big indexes

2007-04-07 Thread Nilesh Bansal
This seems like a very useful patch. Our application searches over 50 million docs in a 40 GB index. We only have simple conjunctive queries on a single field. Currently, the command line search program that prints top-10 results requires at least 200 MB memory. Our web application, that searches the

Re: Out of memory exception for big indexes

2007-04-06 Thread Bublic Online
Hi Ivan, Chris and all! I'm the contributor of LUCENE-769 and I recommend it too :) The OutOfMemory error was one of the main reasons for me to make it. Regards, Artem Vasiliev On 4/6/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : The problem I suspect is the sorting. As I understand, Lucene : bu

Re: Out of memory exception for big indexes

2007-04-06 Thread Otis Gospodnetic
To: java-user@lucene.apache.org Sent: Friday, April 6, 2007 1:10:36 PM Subject: Re: Out of memory exception for big indexes Would it be fair to say that you can expect OutOfMemory errors if you run complex queries? ie sorts, boosts, weights... My query looks like this: +(pathNode

Re: Out of memory exception for big indexes

2007-04-06 Thread Chris Hostetter
: The problem I suspect is the sorting. As I understand, Lucene : builds internal caches for sorting and I suspect that this is the root : of your problem. You can test this by trying your problem queries : without sorting. If sorting really is the cause of your problems, you may want to try out

Re: Out of memory exception for big indexes

2007-04-06 Thread Chris Hostetter
: Would it be fair to say that you can expect OutOfMemory errors if you : run complex queries? ie sorts, boosts, weights... not intrinsically ... the amount of memory used has more to do with the size of the index and the sorting done than it does with the number of clauses in your query (of course
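
A rough back-of-envelope sketch of the claim above, that sort memory tracks index size rather than query complexity. It assumes (this is not stated in the thread) that sorting on a field keeps one cached value per document in the index: 4 bytes per document for an int field, and for a string field a 4-byte ordinal per document plus each distinct term stored once. The class name, method names and the 64-byte per-string figure are hypothetical, chosen only to make the arithmetic concrete.

```java
/** Back-of-envelope estimate of sort-cache memory; all names here are hypothetical. */
public class SortCacheEstimate {

    /** Assumed: one 4-byte int cached per document in the index (not per hit). */
    public static long intFieldBytes(long maxDoc) {
        return maxDoc * 4L;
    }

    /**
     * Assumed: one 4-byte ordinal per document, plus each distinct term stored once.
     * avgTermBytes is a caller-supplied guess that should include object overhead.
     */
    public static long stringFieldBytes(long maxDoc, long uniqueTerms, long avgTermBytes) {
        return maxDoc * 4L + uniqueTerms * avgTermBytes;
    }

    public static void main(String[] args) {
        long maxDoc = 15_000_000L; // upper end of the 10 - 15 million docs mentioned in the thread
        System.out.println("int sort field:    " + intFieldBytes(maxDoc) + " bytes");
        // Hypothetical worst case: every document has a distinct string of about 64 bytes.
        System.out.println("string sort field: " + stringFieldBytes(maxDoc, maxDoc, 64L) + " bytes");
    }
}
```

Under these assumptions the estimate depends only on the document count and the field's values, which matches the observation that the number of query clauses matters much less than index size and sorting.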

Re: Out of memory exception for big indexes

2007-04-06 Thread Craig W Conway
To: java-user@lucene.apache.org Sent: Friday, April 6, 2007 8:20:21 AM Subject: Re: Out of memory exception for big indexes Ivane, Sorts will eat your memory, and how much they use depends on what you store in them - ints, String, floats... A profiler like JProfiler will tell you what's going on, who&

Re: Out of memory exception for big indexes

2007-04-06 Thread Otis Gospodnetic
py -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Ivan Vasilev <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, April 6, 2007 7:09:38 AM Subject: Out of memory exception for big indexes Hi All, I have the following problem - we have OutOfMemo

Re: Out of memory exception for big indexes

2007-04-06 Thread Erick Erickson
I can only shed a little light on a couple of points, see below. On 4/6/07, Ivan Vasilev <[EMAIL PROTECTED]> wrote: Hi All, I have the following problem - we have OutOfMemoryException when searching on the indexes that are of size 20 - 40 GB and contain 10 - 15 million docs. When we make searc

Out of memory exception for big indexes

2007-04-06 Thread Ivan Vasilev
Hi All, I have the following problem - we have OutOfMemoryException when searching on the indexes that are of size 20 - 40 GB and contain 10 - 15 million docs. When we make searches we perform a query that matches all the results but we DO NOT fetch all the results - we fetch 100 of them. We also