Thanks
Che Dong http://www.chedong.com/
Chris Fraschetti wrote:
Surely some folks out there have used Lucene on a large scale and have had to compensate for this somehow. Are there any other solutions? Morus, thank you very much for your input. I am looking into your solution and just putting my feelers out there once more.
The Lucene API documentation is very limited in its descriptions of its components. Short of digging into the code, is there a good doc somewhere out there that explains the workings of Lucene?
On Mon, 4 Oct 2004 01:57:06 -0700, Chris Fraschetti <[EMAIL PROTECTED]> wrote:
So before I spend a significant amount of time digging into the Lucene code, how does your experience with Lucene shed light on my situation? Our current index is pretty huge, and with each increase in size I've had, I've experienced a problem like this. Without taking up too much of your time, because obviously this is my task, I thought I'd ask whether you'd had any experience with this boolean clause nonsense. Of course it can be overcome, but if you know a quick hack, awesome; otherwise, no big deal, but off to work I go :)
-Fraschetti
---------- Forwarded message ----------
From: Morus Walter <[EMAIL PROTECTED]>
Date: Mon, 4 Oct 2004 09:01:50 +0200
Subject: Re: BooleanQuery - Too Many Clases on date range.
To: Lucene Users List <[EMAIL PROTECTED]>, Chris Fraschetti <[EMAIL PROTECTED]>
Chris Fraschetti writes:
So I decided to move my epoch date to the 20040608 date format, which fixed my boolean query problem for my current data size (approx. 600,000) ....
But now, as soon as I do a query like a*, I get the boolean error again. Google obviously can handle this query, and I'm pretty sure Lucene can handle it. Any ideas? With or without a date range specified I still get the TooManyClauses error.
I tried cranking the max clause count up to Integer.MAX_VALUE, but Java gave me an out of memory error. Is this because the boolean search tried to allocate that many clauses by default, or because my query actually needed that many clauses?
The boolean search allocates one clause for every token that has the prefix or matches the wildcard expression.
Why does it work on small indexes but not large?
Because there are fewer tokens starting with a.
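To illustrate why index size matters here, the following is a minimal sketch in plain Java with no Lucene dependency. It assumes a toy sorted term dictionary, and the class and method names are hypothetical. It collects every term that starts with a given prefix, which is the set of terms a prefix query would turn into Boolean clauses, one clause per term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixExpansionDemo {
    // Collects every term in the sorted dictionary that starts with the prefix.
    // Each returned term stands for one clause in the expanded Boolean query.
    static List<String> expand(SortedSet<String> dictionary, String prefix) {
        List<String> matches = new ArrayList<>();
        // In sorted order, all terms sharing the prefix are contiguous,
        // starting at the first term that is >= the prefix.
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term with this prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical term dictionaries standing in for a small and a larger index.
        SortedSet<String> small = new TreeSet<>(Arrays.asList("apple", "banana", "cherry"));
        SortedSet<String> large = new TreeSet<>(Arrays.asList(
                "aardvark", "abacus", "apple", "apricot", "avocado", "banana", "cherry"));

        System.out.println("small index clauses: " + expand(small, "a").size());
        System.out.println("large index clauses: " + expand(large, "a").size());
    }
}
```

The more distinct tokens the index holds, the more of them share a short prefix such as "a", so the clause count grows with the index until it passes the configured maximum.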
Is there any way to have the parser create as many clauses as it can and then search with what it has? w/o recompiling the source?
You need to create your own versions of WildcardQuery and PrefixQuery that take a maximum number of terms and ignore any further clauses. And you need a variant of the query parser that uses these queries.
This can be done, even without recompiling lucene, but you will have to do some programming at the level of lucene queries. Shouldn't be hard, since you can use the sources as a starting point.
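The idea of capping the expansion can be sketched as follows. This is plain Java over a toy sorted term dictionary, not Lucene code: the class name, the Expansion type and the maxTerms parameter are all hypothetical, and a real version would have to be built from the Lucene query sources as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class CappedPrefixExpansion {
    // Result of a capped expansion: the terms that were kept, plus a flag
    // recording whether any matching terms were dropped.
    static final class Expansion {
        final List<String> terms;
        final boolean truncated;

        Expansion(List<String> terms, boolean truncated) {
            this.terms = terms;
            this.truncated = truncated;
        }
    }

    // Expands the prefix against the dictionary but keeps at most maxTerms terms.
    static Expansion expand(SortedSet<String> dictionary, String prefix, int maxTerms) {
        List<String> kept = new ArrayList<>();
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // no more terms with this prefix
            }
            if (kept.size() >= maxTerms) {
                // Another matching term exists but the cap is reached: stop and flag it.
                return new Expansion(kept, true);
            }
            kept.add(term);
        }
        return new Expansion(kept, false);
    }

    public static void main(String[] args) {
        SortedSet<String> dictionary = new TreeSet<>(Arrays.asList("a", "ab", "abc", "b"));
        Expansion e = expand(dictionary, "a", 2);
        System.out.println(e.terms + " truncated=" + e.truncated); // prints "[a, ab] truncated=true"
    }
}
```

Which terms survive the cap is simply dictionary order in this sketch, which is an arbitrary choice. The truncated flag is there so that a caller can at least tell the user that the results are incomplete.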
I guess this does not exist because the Lucene developers decided to prefer a query error over incomplete results.
Morus
-- ___________________________________________________ Chris Fraschetti, Student CompSci System Admin University of San Francisco e [EMAIL PROTECTED] | http://meteora.cs.usfca.edu