On Thu, 29 Apr 2004, Tate Avery wrote: > Hello, > > I have been reviewing some of the code related to boolean queries and I > wanted to see if my understanding is approximately correct regarding how > they are handled and, more importantly, the limitations.
You can always submit requests for enhancements in bugzilla, so as to keep track this issue. > Here is what I have come to understand so far: > > 1) The QueryParser code generated from javacc will parse my boolean query > and determine for each clause whether or not is 'required' (based on a few > conditions, but, in short, whether or not it was introduced or followed by > 'AND') or 'prohibited' (based, in short, on it being preceded by 'NOT'). Your usage seems pretty particular, why are you using the javacc QueryParser? > 2) As my BooleanQuery is being constructed, it will throw a > BooleanQuery.TooManyClauses exception if I exceed > BooleanQuery.maxClauseCount (which defaults to 1024). It's configurable through sys properties or by BooleanQuery.setMaxClauseCount(int maxClauseCount) > > 3) The maxClauseCount threshold appears not to care whether or not my > clauses are 'required' or 'prohibited'... only how many of them there are in > total. > > 4) My BooleanQuery will prepare its own Scorer instance (i.e. > BooleanScorer). And, during this step, it will identify to the scorer which > clauses are 'required' or 'prohibited'. And, if more than 32 fall into this > category, a IndexOutOfBoundsException ("More than 32 required/prohibited > clauses in query.") is thrown. > That's as far as I got. > Now, I am a bit confused at this point. Does this mean I can make a boolean > query consisting of up to 1024 clauses as long as no more than 32 of them > are required or prohibited? This doesn't seem right. So, am I missing > something in the way I am understanding this. > I am (as you may have guessed) generating large boolean queries. And, in > some rare cases, I am receiving the exception identified in #4 (above). So, > I am trying to figure out whether or not I need to change/filter my queries > in a special way in order to avoid this exception. And, in order to do > this, I want to understand how these queries are being handled. > Finally, is there something related to the query syntax that could be my > mistake? For example, what is the difference between: > "A B" AND "C D" AND "D E" > ... and... > ("A B") AND ("C D") AND ("D E") > ... could that be the crux of it? I can't help you here, and the doc seems rather thin (or nonexistent for this class). I don't know the relation between the query and how the scorer will process it. Sorry I can't be of assistance, sv > Thank you for your time, > Tate Avery > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]