Are you getting out-of-order scores? Or does the score change between
requests? Can you show us some of the results you are getting, so we can
see what's going on?

Upayavira

On Fri, Sep 11, 2015, at 05:07 AM, Modassar Ather wrote:
> Thanks Erick and Upayavira for the responses. One thing I noticed with
> a single sort field is that the scores differ in each shard response.
> No two scores are identical within one shard's response, and they also
> differ from the scores in the other shards' responses. I got the scores
> using fl=score.
> 
> Regards,
> Modassar
> 
> On Thu, Sep 10, 2015 at 8:45 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
> 
> > First, if Upayavira's intuition is correct (and I'm guessing it is),
> > then the behavior you're seeing is probably an accident of
> > coding rather than intentional. I think the algorithm is something
> > like this:
> >
> > Node1 gets the original query
> > Node1 sends sub-queries out to each shard.
> > As the results come back, they're sorted one by one into a final
> > list.
> >
> > For simplicity, let's claim _all_ the docs have the exact same score.
> > The _first_ shard's response will completely fill up the final list.
> > The rest will be thrown on the floor, as none of the docs from the
> > other 6 shards will have a higher score than any doc currently in the
> > list.
> >
> > Here's the important part. The order that the sub-requests come back varies
> > due to a zillion possible causes: network latency, a minor GC pause on one
> > of the shards, whether all the caches are loaded, whatever. So subsequent
> > calls will happen to get some _other_ shard's docs in the list first.
> >
> > Does that make sense?
> >
> > On Thu, Sep 10, 2015 at 4:48 AM, Modassar Ather <modather1...@gmail.com>
> > wrote:
> > > If two documents come back from different
> > > shards with the same score, the order would not be predictable
> > >
> > > This is fine.
> > >
> > > What I am not able to understand is that when I do not give a secondary
> > > sort field, all the results come from a single shard, and that shard
> > > changes from one hit to the next. E.g. in the first hit all the results
> > > are from shard1, and in the next hit all the results are from shard2.
> > >
> > > But when I add the secondary sort field I see results from multiple
> > > shards, e.g. from both shard1 and shard2. This does not change across
> > > multiple hits.
> > >
> > > So please help me understand why the same result merging and aggregation
> > > is not happening when a single sort field is given.
> > >
> > > Regards,
> > > Modassar
> > >
> > >
> > >
> > > On Thu, Sep 10, 2015 at 5:03 PM, Upayavira <u...@odoko.co.uk> wrote:
> > >
> > >> What scores are you getting? If two documents come back from different
> > >> shards with the same score, the order would not be predictable -
> > >> probably down to which shard responds first.
> > >>
> > >> Fix it with something like sort=score desc,timestamp desc or some
> > >> other time related field.
> > >>
> > >> Upayavira
> > >>
> > >> On Thu, Sep 10, 2015, at 11:01 AM, Modassar Ather wrote:
> > >> > To add to my previous observation, I saw the response having results
> > >> > from multiple shards when the secondary sort field is added, and they
> > >> > remain the same across hits.
> > >> > Kindly help me understand this behavior. Why are the results changing?
> > >> > As I understand it, the results should first be clubbed together from
> > >> > all shards and then sorted based on their score.
> > >> > But here I see that every time I hit the sort query I am getting
> > >> > results from a different shard, with different scores.
> > >> >
> > >> > Thanks,
> > >> > Modassar
> > >> >
> > >> > On Thu, Sep 10, 2015 at 2:59 PM, Modassar Ather <modather1...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Upayavira! I added fl=id,score,[shard] and saw the shard changing in
> > >> > > the response every time. The response differs between shards, but
> > >> > > for the same shard the result is the same across multiple hits.
> > >> > > When I add a secondary sort field, e.g. score, the shard remains the
> > >> > > same across hits.
> > >> > >
> > >> > > On Thu, Sep 10, 2015 at 12:52 PM, Upayavira <u...@odoko.co.uk> wrote:
> > >> > >
> > >> > >> Add fl=id,score,[shard] to your query, and show us the results of
> > >> > >> two differing executions.
> > >> > >>
> > >> > >> Perhaps we will be able to see the cause of the difference.
> > >> > >>
> > >> > >> Upayavira
> > >> > >>
> > >> > >> On Thu, Sep 10, 2015, at 05:35 AM, Modassar Ather wrote:
> > >> > >> > Thanks Erick. There are no replicas on my cluster and the
> > >> > >> > indexing is one time. No updates or additions are done to the
> > >> > >> > index, and the segments are optimized at the end of indexing.
> > >> > >> > So is adding a secondary sort criterion the only solution for
> > >> > >> > such an issue in sort?
> > >> > >> >
> > >> > >> > Regards,
> > >> > >> > Modassar
> > >> > >> >
> > >> > >> > On Wed, Sep 9, 2015 at 8:21 PM, Erick Erickson <erickerick...@gmail.com>
> > >> > >> > wrote:
> > >> > >> >
> > >> > >> > > When the primary sort criterion is identical for two documents,
> > >> > >> > > then the _internal_ Lucene document ID is used to break the
> > >> > >> > > tie. The internal IDs for two docs can be not only different,
> > >> > >> > > but in a different _order_ on two separate shards. I'm assuming
> > >> > >> > > here that each of your shards has multiple replicas and/or
> > >> > >> > > you're continuing to index to your cluster.
> > >> > >> > >
> > >> > >> > > The internal doc IDs may change, even relative to
> > >> > >> > > each other, when segments get merged.
> > >> > >> > >
> > >> > >> > > So yes, if you are sorting by something that can be identical
> > >> > >> > > in documents, it's always best to specify a secondary sort
> > >> > >> > > criterion. It's not referenced unless there's a tie, so it's
> > >> > >> > > not that expensive. People often use whatever field
> > >> > >> > > is defined for <uniqueKey> since that's _guaranteed_ to
> > >> > >> > > never be the same for two docs.
> > >> > >> > >
> > >> > >> > > Best,
> > >> > >> > > Erick
> > >> > >> > >
> > >> > >> > > On Wed, Sep 9, 2015 at 1:45 AM, Modassar Ather <modather1...@gmail.com>
> > >> > >> > > wrote:
> > >> > >> > > > Hi,
> > >> > >> > > >
> > >> > >> > > > Search results change every time the following query is
> > >> > >> > > > hit. Please note that it is a 7-shard cluster of Solr-5.2.1.
> > >> > >> > > >
> > >> > >> > > > Query: q=network&start=50&rows=50&sort=f_sort asc&group=true&group.field=id
> > >> > >> > > >
> > >> > >> > > > Following are the fields and their types in my schema.xml.
> > >> > >> > > >
> > >> > >> > > > <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> > >> > >> > > > stored="false" omitNorms="true"/>
> > >> > >> > > > <fieldType name="string_dv" class="solr.StrField" sortMissingLast="true"
> > >> > >> > > > stored="false" indexed="true" docValues="true"/>
> > >> > >> > > >
> > >> > >> > > > <field name="id" type="string" stored="true"/>
> > >> > >> > > > <dynamicField name="*_sort" type="string_dv"/>
> > >> > >> > > >
> > >> > >> > > > As per my understanding it seems to be an issue of ties among
> > >> > >> > > > the documents, as when I added a new sort field like below the
> > >> > >> > > > result never changed across multiple hits.
> > >> > >> > > > q=network&start=50&rows=50&sort=f_sort asc, score asc&group=true&group.field=id
> > >> > >> > > >
> > >> > >> > > > Kindly let me know if this is an issue or how this can be
> > >> > >> > > > fixed.
> > >> > >> > > >
> > >> > >> > > > Thanks,
> > >> > >> > > > Modassar
> > >> > >> > >
> > >> > >>
> > >> > >
> > >> > >
> > >>
> >
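
The merge behaviour Erick guesses at in this thread can be sketched as a toy model. This is not Solr's actual merge code: `merge_top_n`, the two fake shards and the document IDs are all made up for illustration. It only shows why ties on a single sort field can let the first shard to respond fill the whole page, and why a unique tie-breaker makes the page independent of arrival order.

```python
def merge_top_n(responses, n, key):
    """Toy coordinator: fold shard responses, in arrival order, into a top-n list.

    sorted() is stable, so on a tie the doc that arrived earlier stays ahead
    and a later shard's tied doc falls off the end of the list.
    """
    top = []
    for response in responses:          # order here models arrival order
        for doc in response:
            top = sorted(top + [doc], key=key)[:n]
    return top


# Hypothetical data: every doc has the same primary sort value.
shard1 = [{"id": "a1", "f_sort": "x", "shard": "shard1"},
          {"id": "a3", "f_sort": "x", "shard": "shard1"}]
shard2 = [{"id": "a2", "f_sort": "x", "shard": "shard2"},
          {"id": "a4", "f_sort": "x", "shard": "shard2"}]


def primary_only(doc):
    # Single sort field: ties are left to arrival order.
    return doc["f_sort"]


def with_tie_breaker(doc):
    # Secondary key on a unique field: every doc has a distinct sort key.
    return (doc["f_sort"], doc["id"])


def ids(docs):
    return [d["id"] for d in docs]


# Same query, two different arrival orders, single sort field.
run_a = ids(merge_top_n([shard1, shard2], 2, primary_only))
run_b = ids(merge_top_n([shard2, shard1], 2, primary_only))

# Same two arrival orders, with the unique tie-breaker.
stable_a = ids(merge_top_n([shard1, shard2], 2, with_tie_breaker))
stable_b = ids(merge_top_n([shard2, shard1], 2, with_tie_breaker))

print(run_a, run_b)        # page contents depend on which shard answered first
print(stable_a, stable_b)  # identical pages, mixing docs from both shards
```

In query terms the second key would correspond to something like sort=f_sort asc, id asc, assuming id is the <uniqueKey> in the schema (the thread does not say so explicitly).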
