Hi Erik, sorry for all my questions. Thank you very much for your speedy answers, and sorry for my bad English.
I am Spanish and I don't speak English very well.
Well, I have one more question.
In the end I am using IndexReader to return all the documents:
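        Directory directory = FSDirectory.getDirectory(path, false);
        IndexReader reader = IndexReader.open(directory);
        // Walk one page of document numbers, [base, end)
        for (int start = base; start < end; start++) {
            // Deleted document numbers cannot be read with reader.document()
            if (reader.isDeleted(start)) continue;
            Document doc = reader.document(start);
            String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
            ides.add(id);
        }
        // Release the index files once the page has been collected
        reader.close();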
It works fine and it is fast. The only problem is that the results cannot be sorted by a metadata field (for example, getting all the documents ordered by title).
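
One possible workaround for the ordering problem is to read the sort field together with the ID during the walk and sort in memory afterwards. The following is only a sketch in plain Java (8 or later), not Lucene code: `SortIdsSketch`, `Entry`, and `idsSortedByTitle` are hypothetical names, and it assumes every document has a non-null title and that all (id, title) pairs fit in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortIdsSketch {
    /** Hypothetical holder for the two stored fields read from each document. */
    public static final class Entry {
        final String id;
        final String title;

        public Entry(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    /** Returns the IDs ordered by title, ignoring case; the input list is not modified. */
    public static List<String> idsSortedByTitle(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((Entry e) -> e.title, String.CASE_INSENSITIVE_ORDER));
        List<String> ids = new ArrayList<>();
        for (Entry e : sorted) {
            ids.add(e.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("7", "Zebra"));
        entries.add(new Entry("3", "apple"));
        entries.add(new Entry("9", "Lucene"));
        System.out.println(idsSortedByTitle(entries)); // prints [3, 9, 7]
    }
}
```

In the loop above, each `Entry` would be filled from `doc.get(...)` for the ID field and for the title field; the name of the title field is not given in this thread.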

My question is about the parameter maxClauseCount. I think the same as you: it is not a good idea to bump up the limit...
If I use the default value (1024) and I search, I get this error:
[SearchCollection,executeQuery] caught a class org.apache.lucene.search.BooleanQuery$TooManyClauses
with message: null

Is there any way to search all the documents (210.000 documents) and have it internally work with only 1024, returning documents up to 1024, without getting the TooManyClauses error??? I need to work efficiently with collections of more than 250.000 records, and the users normally run complex queries (e.g. DATE:[20050601 TO 20050701] AND TITLE:Lucene* ... etc.)

Ah!! I have seen that you are Erik Hatcher, the author of Lucene in Action!!! I didn't understand what you meant about the filter... well, I will read the chapter on filtering a search :-D

Thanks in advance

       Mari Luz

----- Original Message ----- From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, July 07, 2005 5:53 PM
Subject: Re: OUTOFMEMORY ERROR



On Jul 7, 2005, at 9:40 AM, MariLuz Elola wrote:
Thanks Erik,
I was wrong; the exact query that throws the OutOfMemory error is ==> ID:0* -ID:xtent. I have tried to reproduce the error with the query ID:0*, but the exception doesn't appear.

Another thing: when the user searches without entering any query, internally I create the following query ==> ID:0* OR NOT ID:xtent.

That's a hairy query.  I definitely do not recommend doing something
like that with prefix queries.  Check out using a Filter for some of
this sort of thing also.
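
One way to picture the Filter suggestion: in Lucene 1.4 a `Filter` hands the searcher a `java.util.BitSet` of permitted document numbers, so a range restriction can be computed by walking the terms directly and never turns into BooleanQuery clauses. The following is a toy model in plain Java, not Lucene code. `RangeFilterSketch`, `bitsForRange`, `applyFilter`, and the in-memory postings map are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.NavigableMap;
import java.util.TreeMap;

public class RangeFilterSketch {
    /**
     * Sets one bit per document whose term lies in [lower, upper].
     * The postings map stands in for one indexed field: term -> document numbers.
     */
    public static BitSet bitsForRange(NavigableMap<String, int[]> postings,
                                      String lower, String upper, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        // subMap visits only the terms inside the range, however many there are.
        for (int[] docs : postings.subMap(lower, true, upper, true).values()) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Keeps only the query hits that the filter allows. */
    public static BitSet applyFilter(BitSet queryHits, BitSet filterBits) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, int[]> datePostings = new TreeMap<>();
        datePostings.put("20050531", new int[] {0});
        datePostings.put("20050601", new int[] {1, 2});
        datePostings.put("20050615", new int[] {3});
        datePostings.put("20050702", new int[] {4});

        BitSet inRange = bitsForRange(datePostings, "20050601", "20050701", 5);

        // Pretend a separate title query matched documents 0, 2 and 3.
        BitSet titleHits = new BitSet(5);
        titleHits.set(0);
        titleHits.set(2);
        titleHits.set(3);

        System.out.println(applyFilter(titleHits, inRange)); // prints {2, 3}
    }
}
```

The point of the design is that the cost of the range depends on the number of terms walked, not on a clause limit, and the resulting bit set can be reused across searches as long as the index does not change.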

And when QueryParser parses this query I get ID:0* -ID:xtent (translated ==> ID:0* AND NOT ID:xtent), don't I? Is QueryParser working wrong???

It depends.  By default, QueryParser uses OR as the default operator.

About maxClauseCount (1024 by default), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Bumping up that limit is not necessarily the best thing to do - I
recommend changing your approach to querying all documents rather
than trying to make BooleanQuery happy with an enormously inefficient
query.

    Erik



   Mari Luz

----- Original Message ----- From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, July 07, 2005 2:46 PM
Subject: Re: OUTOFMEMORY ERROR



On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:

The query is ==> ID:0*
This query returns all the documents, exactly 210.000 documents.
If the user doesn't specify any criteria in the search user interface, the server searches all the documents.


Doing a prefix query (which ID:0* is) internally builds a
BooleanQuery OR'ing all unique terms in the ID field that begin with
a "0".  The built in limit is 1,024 clauses in a BooleanQuery.

You will need to re-think your approach.  If the goal is to return
all documents, then use IndexReader to walk them.  If the goal is to
have a general user query expression where ID:0* would be entered you
will need to account for that possibility with more system resources
and bumping up the BooleanQuery limit or indexing differently so that
there are not so many terms being put into the BooleanQuery.  It is
difficult to offer specific advice as I'm not sure what your use
cases are.
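
The prefix expansion described above can be modelled without Lucene. The following is a minimal sketch in plain Java, not Lucene code: `PrefixExpansionSketch`, `expandPrefix`, `sampleIds`, and `MAX_CLAUSES` are hypothetical names, the term dictionary is a plain `TreeSet`, and the zero-padded six-digit ID format is an assumption. It only illustrates why a prefix matching more than 1,024 unique terms fails while a narrower prefix succeeds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {
    // Mirrors the default BooleanQuery clause limit mentioned above.
    public static final int MAX_CLAUSES = 1024;

    /** Collects every term that starts with the prefix, one "OR clause" per term. */
    public static List<String> expandPrefix(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        // tailSet jumps to the first term >= prefix; because terms are sorted,
        // all matching terms are contiguous from that point.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= MAX_CLAUSES) {
                throw new IllegalStateException("too many clauses for prefix " + prefix);
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Builds n zero-padded IDs: 000000, 000001, ... (assumed format). */
    public static SortedSet<String> sampleIds(int n) {
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < n; i++) {
            ids.add(String.format("%06d", i));
        }
        return ids;
    }

    public static void main(String[] args) {
        SortedSet<String> ids = sampleIds(210_000);
        // A narrow prefix expands to a handful of clauses.
        System.out.println(expandPrefix(ids, "00012").size()); // prints 10
        try {
            // A one-character prefix matches 100,000 of these IDs.
            expandPrefix(ids, "0");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```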

    Erik





   Mari Luz



---------------------------------------------------
Mari Luz Elola
Developer Engineer
Caleruega, 67
28033 Madrid (Spain)
Tel.: +34 91 768 46 58
mailto: [EMAIL PROTECTED]
---------------------------------------------------
Privileged/Confidential Information may be contained in this message and is intended solely for the use of the named addressee(s). Access to this e-mail by anyone else is unauthorised. If you are not the intended recipient, any disclosure, copying, distribution or re-use of the information contained in it is prohibited and may be unlawful. Opinions, conclusions and any other information contained in this message that do not relate to the official business of Seinet shall be understood as neither given nor endorsed by it. If you have received this communication in error, please notify us immediately by replying to this mail and deleting it from your computer. Thank you.

----- Original Message ----- From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, July 06, 2005 8:12 PM
Subject: Re: OUTOFMEMORY ERROR


We'll need some more details to help.  What query was it?

    Erik

On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:



Hi, I have a problem when I try to run a simple query, without sorting, against an index with 210.000 documents.
After executing the query several times I get the OutOfMemory error.
I am creating an IndexSearcher(pathDir) for every search.
I don't know whether I need to create only one IndexSearcher and cache it. If I search an index with only 50.000 documents, the OutOfMemory error doesn't appear.
------------------------
ENVIRONMENT DESCRIPTION:
------------------------

---SERVER---
MEMORY 2GB
APP SERVER Jboss3.2.3
JAVA_OPTS -Xmx640M -Xms640M

----LUCENE 1.4.3-------
INDEX +- 210.000 documents
EACH DOCUMENT +- 20 fields (metadata)
SIZE TEXT DOCUMENT 1k

------------------------
ERROR:
------------------------
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected Error; nested exception is:
        java.lang.OutOfMemoryError
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.handleException(LogInterceptor.java:374)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.Container.invoke(Container.java:700)
18:52:18,662 ERROR [STDERR] at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
18:52:18,662 ERROR [STDERR] at sun.reflect.DelegatingMethodAccessorImpl.invok
.
.
Exception java.lang.OutOfMemoryError: requested 4 bytes for CMS: Work queue overflow; try -XX:-CMSParallelRemarkEnabled. Out of swap space?


Could anybody help me???

Thanks in advance

    Mari Luz













