Hello Ivan,
That was cool news! Thanks! :) The timings are surprisingly good. 10 million docs
sorted in 20 s.. cool! Also it looks like the sorting algorithm employed by Lucene is
quite memory-efficient.
Not supporting multiple fields is in fact another limitation of my patch. I
don't need it so I didn't i
Hi Artem,
Thank you very much for your mails :)
So first I have to tell you that your patch works perfectly even with
very big indexes - 40 GB (you can see the results below).
The reason I had bad test results last time is that I made a small
change (but I cannot understand why this change
Hi Ivan!
btw, maybe forbidding the sorted search when there are too many results is an
option? I did it this way in my case.
Regards,
Artem.
On 4/24/07, Artem Vasiliev <[EMAIL PROTECTED]> wrote:
Ahhh, you said in your original post that your search matches _all_ the
results.. Yup my patch will not h
Ahhh, you said in your original post that your search matches _all_ the
results.. Yup, my patch will not help much in this case - after all, all the
values have to be read to be compared while sorting! :)
The LUCENE-769 patch helps only if the result set is significantly smaller than the
full index size.
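
As a rough illustration of the point above (a toy model, not the code of the LUCENE-769 patch): if sort values are loaded only for the documents a query matches, the number of value reads equals the size of the result set, so the saving disappears when the query matches the whole index. The function name and the `load_value` callback are hypothetical names chosen for this sketch.

```python
import heapq


def top_n_by_loaded_value(matching_doc_ids, load_value, n):
    """Return (top_n_doc_ids, number_of_value_loads).

    Hypothetical model: load_value(doc_id) stands in for reading one
    sort value from the index for a single matched document.
    """
    loads = 0

    def key(doc_id):
        nonlocal loads
        loads += 1  # one read per matched document
        return load_value(doc_id)

    # Keep only the n smallest sort values among the matched documents.
    top = heapq.nsmallest(n, matching_doc_ids, key=key)
    return top, loads


# Toy "index" of 1000 documents with made-up sort values.
values = {doc_id: (doc_id * 7) % 1000 for doc_id in range(1000)}

# A selective query: 10 matches, so only 10 values are read.
few, few_loads = top_n_by_loaded_value(range(0, 1000, 100), values.__getitem__, 3)

# A match-all query: every value in the index has to be read.
everything, all_loads = top_n_by_loaded_value(range(1000), values.__getitem__, 3)
```

In this model the selective query costs 10 reads and the match-all query costs 1000, which is why a match-all search gains little from loading values per result.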
Regards
Hello Ivan!
It's so sad to me that you had bad results with that patch. :)
The discussion in the ticket is out-of-date - the patch was initially in
several classes and used WeakHashMap, but then it evolved into what it is now - one
StoredFieldSortFactory class. I use it in my sharehound app in pretty m
Hi All,
THANK YOU FOR YOUR HELP :)
I put this problem in the forum but I had no chance to work on it last
week unfortunately...
So now I tested Artem's patch but the results show:
1) speed is very slow compared with usage without the patch
2) There is no big difference in memory usage
, 2007 11:00:53 AM
Subject: Re: Out of memory exception for big indexes
: Would it be fair to say that you can expect OutOfMemory errors if you
: run complex queries? ie sorts, boosts, weights...
not intrinsically ... the amount of memory used has more to do with the size
of the index and the sortin
Hello Nilesh,
Sunday, April 8, 2007, 10:58:32 PM, you wrote:
[talkin' about LUCENE-769]
>> I must note that my patch only helps in lucene-OOM situations related to
>> _sorted_ queries. If this is your case then I think yes, it will help.
NB> Probably a newbie question, but can you please explain
Hello Nilesh and all!
NB> This seems like a very useful patch. Our application searches over 50
NB> million doc in a 40GB index. We only have simple conjunctive queries
NB> on a single field. Currently, the command line search program that
NB> prints top-10 results requires at least 200 MB memory.
It *is* a bit confusing, since every search is sorted, kinda
Practically, a sorted query is one where you call one of the search
methods (on, say, Searcher) with a Sort object, which sorts
on one or more of the fields in your index (which ones are
used are specified in the (array of) Sort obj
On 4/8/07, Artem <[EMAIL PROTECTED]> wrote:
I must note that my patch only helps in lucene-OOM situations related to
_sorted_ queries. If this is your case then I think yes, it will help.
Probably a newbie question, but can you please explain what sorted
queries mean? Is simple keyword search a s
Hello Nilesh,
Sunday, April 8, 2007, 9:03:06 AM, you wrote:
NB> This seems like a very useful patch. Our application searches over 50
NB> million doc in a 40GB index. We only have simple conjunctive queries
NB> on a single field. Currently, the command line search program that
NB> prints top-10 r
This seems like a very useful patch. Our application searches over 50
million doc in a 40GB index. We only have simple conjunctive queries
on a single field. Currently, the command line search program that
prints top-10 results requires at least 200 MB memory. Our web
application, that searches the
Hi Ivan, Chris and all!
I'm the contributor of LUCENE-769 and I recommend it too :)
The OutOfMemory error was one of the main reasons for me to make it.
Regards,
Artem Vasiliev
On 4/6/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: The problem I suspect is the sorting. As I understand, Lucene
: bu
To: java-user@lucene.apache.org
Sent: Friday, April 6, 2007 1:10:36 PM
Subject: Re: Out of memory exception for big indexes
Would it be fair to say that you can expect OutOfMemory errors if you run
complex queries? ie sorts, boosts, weights...
My query looks like this:
+(pathNode
: The problem I suspect is the sorting. As I understand, Lucene
: builds internal caches for sorting and I suspect that this is the root
: of your problem. You can test this by trying your problem queries
: without sorting.
if Sorting really is the cause of your problems, you may want to try out
: Would it be fair to say that you can expect OutOfMemory errors if you
: run complex queries? ie sorts, boosts, weights...
not intrinsically ... the amount of memory used has more to do with the size
of the index and the sorting done than it does with the number of clauses
in your query (of course
To: java-user@lucene.apache.org
Sent: Friday, April 6, 2007 8:20:21 AM
Subject: Re: Out of memory exception for big indexes
Ivane,
Sorts will eat your memory, and how much they use depends on what you store in
them - ints, String, floats...
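
A back-of-the-envelope sketch of that dependence, assuming (this is an assumption of the sketch, not a measurement of Lucene) that sorting keeps one value in memory for every document in the index. The per-value sizes and the function name are illustrative only; the 15 million figure is the document count from the original post.

```python
# Assumed per-document sizes in bytes; illustrative figures only,
# not measurements of Lucene's actual sort caches.
BYTES_PER_VALUE = {"int": 4, "float": 4}


def estimate_sort_cache_bytes(num_docs, field_type, avg_string_bytes=0):
    """Estimate memory for one sort field, one value per document."""
    if field_type == "string":
        # Strings have no fixed width, so the caller supplies an average.
        per_doc = avg_string_bytes
    else:
        per_doc = BYTES_PER_VALUE[field_type]
    return num_docs * per_doc


# 15 million docs sorted on an int field: 60,000,000 bytes (about 57 MiB).
int_bytes = estimate_sort_cache_bytes(15_000_000, "int")
int_mib = int_bytes / (1024 * 1024)

# The same index sorted on strings averaging an assumed 20 bytes each.
string_bytes = estimate_sort_cache_bytes(15_000_000, "string", avg_string_bytes=20)
```

Under these assumptions a string sort field costs several times more than an int field for the same index, and a profiler is still needed to see the real numbers.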
A profiler like JProfiler will tell you what's going on, who&
py -- http://www.simpy.com/ - Tag - Search - Share
- Original Message
From: Ivan Vasilev <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, April 6, 2007 7:09:38 AM
Subject: Out of memory exception for big indexes
Hi All,
I have the following problem - we have OutOfMemo
I can only shed a little light on a couple of points, see below.
On 4/6/07, Ivan Vasilev <[EMAIL PROTECTED]> wrote:
Hi All,
I have the following problem - we have OutOfMemoryException when
searching on the indexes that are of size 20 - 40 GB and contain 10 - 15
million docs.
When we make searc
Hi All,
I have the following problem - we have OutOfMemoryException when
searching on the indexes that are of size 20 - 40 GB and contain 10 - 15
million docs.
When we make searches we perform a query that matches all the results but we
DO NOT fetch all the results - we fetch 100 of them. We also