Re: Read large size index

m.harig Tue, 30 Jun 2009 05:31:15 -0700

Hi there,

On Tue, Jun 30, 2009 at 12:41 PM, m.harig<m.ha...@gmail.com> wrote:
>
> Thanks Simon ,
>
>          Its working now , thanks a lot , i've a doubt
>
>       i've got 30,000 pdf files indexed ,  but if i use the code which you
> sent , returns only 200 results , because am setting   TopDocs topDocs =
> searcher.search(query,200);  as i said if use Integer.MAX_VALUE , it
> returns
> java heap space error , even i can't use 300 ,
The Integer.MAX_VALUE was my fault. Internally lucene allocates an
array of the size n (searcher.search(query,n)) even if your query only
returns 1 document. This causes the OOM. Only get as many results as
you need!

In turn is iterating and loading of all those documents necessary?

no need to iterate all documents , i set searcher.search(query,10000) , am
getting the results , 

What is your usecase of lucene where you have to load 30k of
documents? You have to be aware of that if you load 30k docs you need
enough memory for them in you JVM. I have no idea how you index and
what you store in the index but 30k pdf with -Xmx128M is not much :)

is there any way to get the total hits from the index when i search for a
keyword? i mean i set TopDocCollector collector = new
TopDocCollector(10000); , so the results will not exceed more than 10k ,
what am asking is i need to display the total hits from the index , it might
be more than 10k , like google did ,  Results 1 - 10 of about 51,200 , can
you please tell me..

simon

-- 
View this message in context: 
http://www.nabble.com/Read-large-size-index-tp24251993p24271025.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: Read large size index

Reply via email to