Re: AND operator in multi valued fields

Alexandre Rafalovitch Thu, 18 Sep 2014 14:22:33 -0700

Well, I can think of four ways, increasingly complicated.

1) You could have both a parent record with unzipped events and the
child events as individual documents. Then you filter on the
children and highlight on the parent documents.

2) The other way is to have a custom post filter that looks at the
matches and discards the ones that have different offsets (by using a
very large positionIncrementGap to create clear group boundaries). But
I don't know whether you can access the match token offsets in the
post filter, so this is more of a thought experiment.

3) You could also duplicate the main field contents and make the
"document" one per event. If most of the fields are indexed, that's OK
and there is no real duplication. But you may need to store fields for
the highlighter, and those are not de-duplicated internally, as far as
I know.

4) You could create zipped pairs of values in a dedicated field and
search that with near-queries. But then you do have to have the same
analyzer for all members. Sounds like this may not be an option for
you.
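
To make the false positive and option 4 concrete, here is a toy sketch
in plain Python. It is not Solr code: the event values, the helper
names (index_positions, near) and the position arithmetic are all made
up for illustration, and real near-query slop is computed differently.
It only shows why a large gap between values keeps a near-query inside
one event.

```python
# Toy model of a multi-valued field: every token gets a position, and each
# new value starts `gap` positions after the previous one (roughly the role
# positionIncrementGap plays in a Solr field type). Illustration only.

def index_positions(values, gap=1000):
    """Map each token to the list of positions where it occurs."""
    positions = {}
    next_pos = 0
    for value in values:
        tokens = value.lower().split()
        for offset, token in enumerate(tokens):
            positions.setdefault(token, []).append(next_pos + offset)
        next_pos += len(tokens) + gap
    return positions

def near(positions, a, b, slop):
    """True if tokens a and b occur within `slop` positions of each other."""
    return any(abs(pa - pb) <= slop
               for pa in positions.get(a, [])
               for pb in positions.get(b, []))

# Hypothetical document with two events; each event is (type, place).
events = [("meeting", "paris"), ("conference", "london")]

# Parallel multi-valued fields: AND across fields ignores which event matched.
types = {t for t, _ in events}
places = {p for _, p in events}
false_positive = "meeting" in types and "london" in places

# Option 4: one zipped value per event, searched as a near-query.
zipped = index_positions([f"{t} {p}" for t, p in events])
same_event = near(zipped, "meeting", "paris", slop=10)
cross_event = near(zipped, "meeting", "london", slop=10)
```

Here false_positive is True because the two parallel fields know
nothing about event boundaries, while the near-query on the zipped
field matches "meeting"/"paris" and rejects "meeting"/"london", since
the gap pushes the second event far beyond the slop.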

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 18 September 2014 16:41, lboutros <boutr...@gmail.com> wrote:
> Thx Alex.
>
> We have main documents in the index. (more than 100 complex fields).
>
> Each document can have events attached.
>
> An event contains 4 fields with 3 different analyzers.
>
> We need more than just filtering on them (highlighting on documents and
> events at the same time for instance).
> That means that nested documents cannot be used.
>
> These events are indexed as additional multi valued fields in each
> document.
> They are searched like any other field.
>
> The issue here is that the operator 'AND' between event fields can match
> false positives.
>
> We do not know the position during search. We just want to respect the event
> integrity in the search. So you are right, we just want them to be parallel
> within their tokenized groups?
>
> The first idea was to index the event in only one field and use
> proximity/phrase search in order to prevent false positives.
>
> But that means that we need to index dates, ids and text in one unique
> field.
>
> Do you think this could be a better/easier approach ?
>
> Ludovic
>
>
>
> -----
> Jouve
> France.
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/AND-operator-in-multi-valued-fields-tp4159715p4159797.html
> Sent from the Solr - User mailing list archive at Nabble.com.
