<<<I think Lucene sees all these values as one long value for the field "option">>> Not quite. Starting with the second add, a call will be made to getPositionIncrementGap in your analyzer. If you return a number larger than one, then the offsets between the last term of the preceeding add and the first term of this add will be that number. If you do nothing with getPositionIncrementGap, then Lucene does, indeed, see all the terms as one long value since it returns 1.....
Here's an illusrtation in which PositionIncrementGap returns an offset of 10 using your example adds. Note, these numbers may be off by the infamous 1 or even 2. term offset value1 0 aaa 1 value2 11 bbb 12 value3 22 ccc 23 So, you really don't care about the slop, since you can set it to less than the magic number you return from PositionIncrementGap. BTW, slop indicates holes, not total terms. So with a slop of 0 all the words need to be next to each other, regardless of whether there are two words or 20. But you still have to do the trick with getPositionIncrementGap in order to fail to match on something like "bbb value3", where the last term is next to the frist term of the next token...... HTH Erick On Mon, Oct 12, 2009 at 4:31 PM, Angel, Eric <ean...@business.com> wrote: > I need to analyze these values since I also want the benefits > porterStemmer. The problem with using PhraseQuery is that I don't > always know the slop. I may have values like "value4 ddd aaa". It's a > tricky problem because I think Lucene sees all these values as one long > value for the field "option". > > -----Original Message----- > From: Jake Mannix [mailto:jake.man...@gmail.com] > Sent: Monday, October 12, 2009 1:25 PM > To: java-user@lucene.apache.org > Subject: Re: querying multi-value fields > > Or else just make sure that you use PhraseQuery to hit this field when > you > want "value1 aaa". If you don't tokenize these pairs, then you will > have to > > do prefix/wildcard matching to hit just "value1" by itself (if this is > allowed > by your business logic). > > -jake > > On Mon, Oct 12, 2009 at 1:21 PM, Adriano Crestani > <adrianocrest...@gmail.com > > wrote: > > > Hi Eric, > > > > To achieve what you want, do not tokenize the values you query/add to > this > > field. > > > > On Mon, Oct 12, 2009 at 4:05 PM, Angel, Eric <ean...@business.com> > wrote: > > > > > I have documents that store multiple values in some fields (using > the > > > document.add(new Field()) with the same field name). Here's what a > > > typical document looks like: > > > > > > > > > > > > doc.option="value1 aaa" > > > > > > doc.option="value2 bbb" > > > > > > doc.option="value3 ccc" > > > > > > > > > > > > I want my queries to only match individual values, for example, a > query > > > for "value2 bbb" would return the above document, but a query for > > > "value1 ccc" should not. Is this at all possible in lucene at query > > > time? Could payloads be used for this? > > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >