We've noticed some pretty non-deterministic behavior with sharded
setups as well.

One thing we've noticed is that a query server can hang on to the set
of document ids that correspond to a given query even when caching is
off, which leads to some odd results. For example, a query like:

timestamp:[NOW-8HOUR TO NOW]

will return no results, but:

timestamp:[NOW-7HOUR TO NOW]

will, if the former query was first executed before a replication that
brought in documents matching both queries.

We've also noticed numFound changing during paging through query
results, as you mention.

One of our use cases is more of a reporting function and it depends on
there being more deterministic behavior than this, so in the shifting
numFound case, we've written code to detect a shift and restart the
query from the beginning.
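
A minimal sketch of that kind of detect-and-restart loop might look
like the following. It is illustrative only, not our production code:
fetch_page is a hypothetical callback that runs the query for a given
start/rows and returns (numFound, docs), and max_restarts is an assumed
safeguard against looping forever while the index keeps changing.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical shape of one page of results: (numFound, docs).
Page = Tuple[int, List[dict]]


def fetch_all_stable(fetch_page: Callable[[int, int], Page],
                     rows: int = 4000,
                     max_restarts: int = 3) -> List[dict]:
    """Page through a result set, restarting if numFound shifts mid-way."""
    for _attempt in range(max_restarts + 1):
        docs: List[dict] = []
        expected: Optional[int] = None
        start = 0
        shifted = False
        while True:
            num_found, page = fetch_page(start, rows)
            if expected is None:
                # The first page fixes the total we expect for this pass.
                expected = num_found
            elif num_found != expected:
                # The total moved under us: discard this pass and restart.
                shifted = True
                break
            docs.extend(page)
            start += rows
            # Stop once we have paged past the total, or on an empty page.
            if start >= expected or not page:
                break
        if not shifted:
            return docs
    raise RuntimeError("numFound kept shifting; giving up")
```

The caller supplies fetch_page, so the same loop works whether the
request goes through the distributed handler or to a single shard.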

In the case of cached documentIds not revealing fresher information,
I'm worried that we're going to have to move to querying each shard in
turn, which may mean we get left out of using SolrCloud. We haven't
tried to evaluate it yet, however.

Michael Della Bitta

------------------------------------------------
Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 8, 2012 at 8:10 AM, Rohit <ro...@in-rev.com> wrote:
> Hi,
>
>
>
> We are using Solr 3.6 with 2 shards. We are noticing that when we fire a query
> with start as 0 and rows X, the total numFound changes when we fire the same
> exact query with start as Y and rows X.
>
>
>
> For example.
>
>
>
> First time
>
> query=abc&start=0&rows=4000
>
> numFound- 56000
>
>
>
> Second time
>
> query=abc&start=4000&rows=4000
>
> numFound- 55998
>
>
>
> What can cause this?
>
> Regards,
>
> Rohit
>
>
>
