Tested it out and it seems to work well as long as I set the gap to a value much larger than the length of the text; 10000 appears to work fine for our current data. Thanks heaps for all the help, guys!

Scott.

On 2/03/11 11:13 AM, Jonathan Rochkind wrote:
Each token has a position set on it. So if you index the value "alpha beta gamma", it winds up stored in Solr as (sort of, for the way we want to look at it)

document1:
    alpha:    position 1
    beta:    position 2
    gamma:    position 3

If you set the position increment gap large, then after one value in a multi-valued field ends, the position increment gap is added to the positions for the next value. Solr doesn't actually have much of an internal notion of a multi-valued field: all a multi-valued indexed field is, is a position increment gap separating tokens from different 'values'.

So if you index the values ["alpha beta gamma", "aleph bet"] in a multi-valued field with position increment gap 10000, you get something like:

document1:
    alpha: 1
    beta: 2
    gamma: 3
    aleph: 10004
    bet: 10005
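
The position arithmetic above can be reproduced with a small toy model. This is only a sketch of the bookkeeping as described in this message, not Solr or Lucene code; the function name `assign_positions` is made up for illustration, and positions start at 1 to match the listing above.

```python
def assign_positions(values, gap):
    """Toy model: return (token, position) pairs for a multi-valued field."""
    positions = []
    pos = 0
    for value in values:
        for token in value.split():
            pos += 1  # each token advances the position by one
            positions.append((token, pos))
        pos += gap  # jump ahead by the gap before the next value starts
    return positions


positions = assign_positions(["alpha beta gamma", "aleph bet"], gap=10000)
for token, pos in positions:
    print(f"{token}: {pos}")
```

With a gap of 10000 this prints the same five positions as the listing above.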

A large position increment gap, as far as I know and can tell (please someone correct me if I'm wrong, I am not a Solr developer), has no effect on the size or efficiency of your index on disk.

I am not sure why positionIncrementGap doesn't just default to a very large number, to provide behavior that more closely matches what people expect from the idea of a "multi-valued field". So maybe there is some flaw in my understanding, and there is a good reason for it not to be this way?

But I set my positionIncrementGap very large, and haven't seen any issues.


On 3/1/2011 5:46 PM, Scott Yeadon wrote:
The only trick with this is ensuring the searches return the right
results and don't go across value boundaries. If I set the gap to the
largest text size we expect (approx 5000 chars) what impact does such a
large value have (i.e. does Solr physically separate these fragments in
the index or just apply the figure as part of any query)?

Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:
In a multiValued field, call it field1, if I have two values indexed to this field, say value 1 = "some text...termA...more text" and value 2 = "some text...termB...more text" and do a search such as field1:(termA termB) (where <solrQueryParser defaultOperator="AND"/>) I'm getting a hit returned even though both terms don't occur within a single value in the multiValued field.

What I'm wondering is if there is a way of applying the query against each value of the field rather than against the field in its entirety. The reason is that the number of values I want to store is variable and I'd like to avoid the use of dynamic fields or restructuring the index if possible.

Your best bet is to use positionIncrementGap and issue a phrase query (implicit AND) with an appropriate slop value.

If you have positionIncrementGap="100", you can simulate this using
&q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/
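
To see why the slop and the gap interact this way, here is a rough sketch in Python. It is a simplified model, not how Lucene scores sloppy phrases: real slop is an edit distance and treats reversed terms differently, whereas this just counts the positions between two terms. The helper names `token_positions` and `sloppy_match`, the sample tokens, and the slop of 50 are all assumptions made for illustration.

```python
def token_positions(values, gap):
    """Toy model: map each token to its position across a multi-valued field."""
    positions = {}
    pos = 0
    for value in values:
        for token in value.split():
            pos += 1
            positions[token] = pos
        pos += gap  # the gap separates one value's tokens from the next
    return positions


def sloppy_match(positions, term_a, term_b, slop):
    """Simplified check: do the positions between the two terms fit in the slop?"""
    distance = abs(positions[term_a] - positions[term_b]) - 1
    return distance <= slop


# Two values in one multi-valued field; all token names are made up.
values = ["a1 a2 termA a3", "b1 b2 termB b3 termC"]
pos = token_positions(values, gap=100)

same_value = sloppy_match(pos, "termB", "termC", slop=50)   # both in value 2
cross_value = sloppy_match(pos, "termA", "termB", slop=50)  # different values
print(same_value, cross_value)
```

In this model the two terms inside one value match, while the pair that straddles the gap does not. It also shows the catch: the slop has to cover the longest value you want to match within, so the gap has to be larger than that too.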






