RE: solr query gives different numFound upon refreshing
We wrote a script that queries each Solr instance in the cloud (http://$host/solr/replication?command=details), subtracts the ‘replicableVersion’ number from the ‘indexVersion’ number, converts the difference to minutes, and alerts if it exceeds 20 minutes. We get alerted many times a day. The soft commit setting is every 7 minutes. Any idea what might be wrong here?

This is our commit setting.

15000 10 false 45

We got rid of all max new searcher errors.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, September 04, 2014 6:07 PM
To: solr-user@lucene.apache.org
Subject: Re: solr query gives different numFound upon refreshing

Does this persist if you issue a hard commit? You can do something like http://solr/collection/update?stream.body=
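A minimal sketch of the lag check described in this message, written in Python. It is not the poster's script. It assumes that both version numbers are epoch timestamps in milliseconds and that `replicableVersion` sits under a `master` key in the parsed details response; both are assumptions, and the function names are hypothetical.

```python
from typing import Any, Dict

ALERT_THRESHOLD_MINUTES = 20  # threshold quoted in the message


def lag_minutes(details: Dict[str, Any]) -> float:
    """Return indexVersion minus replicableVersion, converted to minutes.

    Assumption: both values are epoch timestamps in milliseconds, and
    replicableVersion is nested under a "master" key.
    """
    index_version = int(details["indexVersion"])
    replicable_version = int(details["master"]["replicableVersion"])
    return (index_version - replicable_version) / 60_000.0


def should_alert(details: Dict[str, Any],
                 threshold: float = ALERT_THRESHOLD_MINUTES) -> bool:
    """True when the computed lag exceeds the threshold in minutes."""
    return lag_minutes(details) > threshold
```

Keeping the arithmetic separate from the HTTP call makes it easy to test the check against captured responses before wiring it to real hosts.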
Re: solr query gives different numFound upon refreshing
Does this persist if you issue a hard commit? You can do something like http://solr/collection/update?stream.body=

On Thu, Sep 4, 2014 at 2:19 PM, shamik wrote:
> I've noticed similar behavior with our Solr cloud cluster for a while, it's
> random though. We've 2 shards with 3 replicas each. At times, I've observed
> that the same query on refresh will fetch different results (numFound) as
> well as the content. The only way to mitigate is to refresh the index with
> the documents till the nodes are in sync. I always use SolrJ which talks to
> Solr through zookeeper, even with that it seemed to be unavoidable at times.
> We are committing every 10 mins. I'm pretty much sure there's a minor glitch
> which creates a sync issue at times.
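One way to issue an explicit hard commit is the `commit=true` request parameter on the update handler. The sketch below only builds such a URL. The base URL and collection name are placeholders, not values from this thread, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def hard_commit_url(base_url: str, collection: str) -> str:
    """Build an update-handler URL that requests a hard commit.

    base_url and collection are placeholders supplied by the caller.
    """
    params = urlencode({"commit": "true"})
    return f"{base_url}/{collection}/update?{params}"


if __name__ == "__main__":
    # Hypothetical host and collection, for illustration only.
    print(hard_commit_url("http://localhost:8983/solr", "collection1"))
```

Requesting the resulting URL once and then re-running the query a few times shows whether numFound still varies after all replicas have committed.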
Re: solr query gives different numFound upon refreshing
I've noticed similar behavior with our Solr cloud cluster for a while; it's random, though. We have 2 shards with 3 replicas each. At times, I've observed that the same query on refresh will fetch different results (numFound) as well as different content. The only way to mitigate it is to refresh the index with the documents until the nodes are in sync. I always use SolrJ, which talks to Solr through ZooKeeper; even with that, it seemed to be unavoidable at times. We are committing every 10 minutes. I'm pretty sure there's a minor glitch which creates a sync issue at times.

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-query-gives-different-numFound-upon-refreshing-tp4155414p4157026.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr query gives different numFound upon refreshing
bq: What is Master Searching vs. Master Replicable vs Slave Searching

Likely leftover from the days when master/slave was the only option. You can pretty much ignore it in SolrCloud, I think.

Best,
Erick

On Fri, Aug 29, 2014 at 11:38 AM, Joshi, Shital wrote:
> Eric,
>
> Thanks your reply.
>
> We will increase autocommit setting and let you know.
>
> We are using Solr Cloud (4.8.0). When from the Solr admin gui, select a
> collection and see the Overview tab, We see three versions of index though
> we have just 1 replica.
>
> Master (Searching)
> Master (Replicable)
> Slave (Searching)
>
> What is Master Searching vs. Master Replicable vs Slave Searching?
>
> Thanks.
RE: solr query gives different numFound upon refreshing
Erick,

Thanks for your reply.

We will increase the autocommit setting and let you know.

We are using Solr Cloud (4.8.0). When we select a collection in the Solr admin GUI and open the Overview tab, we see three versions of the index even though we have just 1 replica.

Master (Searching)
Master (Replicable)
Slave (Searching)

What is Master Searching vs. Master Replicable vs. Slave Searching?

Thanks.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, August 29, 2014 12:22 AM
To: solr-user@lucene.apache.org
Subject: Re: solr query gives different numFound upon refreshing

First, I want to be sure you're not mixing old-style replication and SolrCloud. Your use of Master/Slave causes this question.

Second, your maxWarmingSearchers error indicates that your commit interval is too short relative to your autowarm times. Try lengthening your autocommit settings (probably soft commit) until you no longer see that error message and see if the problem goes away. If it doesn't, let us know.

Best,
Erick
Re: solr query gives different numFound upon refreshing
First, I want to be sure you're not mixing old-style replication and SolrCloud. Your use of Master/Slave causes this question.

Second, your maxWarmingSearchers error indicates that your commit interval is too short relative to your autowarm times. Try lengthening your autocommit settings (probably soft commit) until you no longer see that error message and see if the problem goes away. If it doesn't, let us know.

Best,
Erick

On Thu, Aug 28, 2014 at 9:39 AM, Joshi, Shital wrote:
> Hi Shawn,
>
> Thanks for your reply.
>
> We did some tests enabling shards.info=true and confirmed that there is
> not duplicate copy of our index.
>
> We have one replica but many times we see three versions on Admin
> GUI/Overview tab. All three has different versions and gen. Is that a
> problem?
> Master (Searching)
> Master (Replicable)
> Slave (Searching)
>
> We constantly see max searcher open exception. The warmup time is 1.5
> minutes but the difference between openedAt date and registeredAt date is
> at times more than 4-5 minutes. Is the true searcher time the difference
> between two dates and not the warmupTime?
>
> openedAt: 2014-08-28T16:17:24.829Z
> registeredAt: 2014-08-28T16:21:02.278Z
> warmupTime: 65727
>
> Thanks for all help.
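The relationship between commit interval and warm-up time can be sketched with a deliberately simplified model: if a new searcher is opened at every commit and each one needs a fixed time before it is registered, the number of searchers warming at once is roughly the warm-up time divided by the commit interval, rounded up. The function name and the model itself are assumptions for illustration. Real clusters also see explicit commits from clients, which this model ignores.

```python
import math


def max_overlapping_searchers(warm_seconds: float,
                              commit_interval_seconds: float) -> int:
    """Estimate how many searchers warm concurrently in the worst case.

    Simplified model (an assumption): one searcher is opened per commit
    interval and each takes warm_seconds before it is registered.
    """
    if commit_interval_seconds <= 0:
        raise ValueError("commit interval must be positive")
    return max(1, math.ceil(warm_seconds / commit_interval_seconds))


if __name__ == "__main__":
    # 90 s warm-up with a 5 minute soft commit stays at one searcher.
    print(max_overlapping_searchers(90, 300))
    # The same warm-up with commits every 30 s overlaps three searchers,
    # which would exceed a maxWarmingSearchers limit of 2.
    print(max_overlapping_searchers(90, 30))
```

Under this model, hitting the limit with a 5 minute soft commit suggests that something else is opening searchers more often than the configured interval, or that warm-up takes far longer than reported.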
RE: solr query gives different numFound upon refreshing
Hi Shawn,

Thanks for your reply.

We did some tests enabling shards.info=true and confirmed that there is no duplicate copy of our index.

We have one replica, but many times we see three versions on the Admin GUI Overview tab. All three have different version and gen values. Is that a problem?

Master (Searching)
Master (Replicable)
Slave (Searching)

We constantly see the max searcher open exception. The warmup time is 1.5 minutes, but the difference between the openedAt date and the registeredAt date is at times more than 4-5 minutes. Is the true searcher time the difference between the two dates and not the warmupTime?

openedAt: 2014-08-28T16:17:24.829Z
registeredAt: 2014-08-28T16:21:02.278Z
warmupTime: 65727

Thanks for all help.

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Wednesday, August 27, 2014 2:37 PM
To: solr-user@lucene.apache.org
Subject: Re: solr query gives different numFound upon refreshing

A replica out of sync is a possibility, but the most common reason for a changing numFound is because the overall distributed index has more than one document with the same uniqueKey value -- different versions of the same document in more than one shard.

SolrCloud tries really hard to never end up with replicas out of sync, but either due to highly unusual circumstances or bugs, it could still happen.

Thanks,
Shawn
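To make the gap between the two timestamps concrete, the sketch below computes it from the values quoted in the message and compares it with warmupTime. It assumes warmupTime is reported in milliseconds. The helper name is hypothetical.

```python
from datetime import datetime


def parse_solr_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2014-08-28T16:17:24.829Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


# Values copied from the message.
opened_at = parse_solr_timestamp("2014-08-28T16:17:24.829Z")
registered_at = parse_solr_timestamp("2014-08-28T16:21:02.278Z")
warmup_ms = 65727  # assumption: milliseconds

# Wall-clock time between opening and registering the searcher.
open_to_register_s = (registered_at - opened_at).total_seconds()
warmup_s = warmup_ms / 1000.0

if __name__ == "__main__":
    print(f"openedAt -> registeredAt: {open_to_register_s:.3f} s")
    print(f"warmupTime:               {warmup_s:.3f} s")
    print(f"unaccounted:              {open_to_register_s - warmup_s:.3f} s")
```

For these values the searcher took about 217 seconds from open to registration, while warmupTime accounts for only about 66 seconds of that.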
Re: solr query gives different numFound upon refreshing
On 8/27/2014 10:44 AM, Bryan Bende wrote:
> Theoretically this shouldn't happen, but is it possible that the two
> replicas for a given shard are not fully in sync?
>
> Say shard1 replica1 is missing a document that is in shard1 replica2... if
> you run a query that would hit on that document and run it a bunch of
> times, sometimes replica 1 will handle the request and sometimes replica 2
> will handle it, and it would change your number of results if one of them
> is missing a document. You could write a program that compares each
> replica's documents by querying them with distrib=false.
>
> If there was a replica out of sync, I would think it would detect that on a
> restart when comparing itself against the leader for that shard, but I'm
> not sure.

A replica out of sync is a possibility, but the most common reason for a changing numFound is because the overall distributed index has more than one document with the same uniqueKey value -- different versions of the same document in more than one shard.

SolrCloud tries really hard to never end up with replicas out of sync, but either due to highly unusual circumstances or bugs, it could still happen.

Thanks,
Shawn
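The duplicate-uniqueKey case can be checked offline once the ids held by each shard have been exported. The sketch below is one hypothetical way to do that in Python. The shard names and ids are made up, and the export step (for example, querying each shard directly) is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def ids_in_multiple_shards(
        ids_by_shard: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each uniqueKey value found in more than one shard to those shards."""
    shards_for_id: Dict[str, Set[str]] = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_for_id[doc_id].add(shard)
    return {doc_id: shards
            for doc_id, shards in shards_for_id.items()
            if len(shards) > 1}


if __name__ == "__main__":
    # Made-up data: "c" was indexed into both shards.
    sample = {"shard1": ["a", "b", "c"], "shard2": ["c", "d"]}
    print(ids_in_multiple_shards(sample))
```

An empty result means no uniqueKey value appears in more than one shard, which points back toward the out-of-sync replica explanation.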
Re: solr query gives different numFound upon refreshing
Theoretically this shouldn't happen, but is it possible that the two replicas for a given shard are not fully in sync?

Say shard1 replica1 is missing a document that is in shard1 replica2... if you run a query that would hit on that document and run it a bunch of times, sometimes replica 1 will handle the request and sometimes replica 2 will handle it, and it would change your number of results if one of them is missing a document. You could write a program that compares each replica's documents by querying them with distrib=false.

If there was a replica out of sync, I would think it would detect that on a restart when comparing itself against the leader for that shard, but I'm not sure.

On Wed, Aug 27, 2014 at 11:37 AM, Joshi, Shital wrote:
> Hi,
>
> We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. We have
> three collections. We recently upgraded from 4.4.0 from 4.8. We have ~850
> mil documents.
>
> We are facing an issue where refreshing a Solr query may give different
> results (number of documents returned). This issue is seen in all three
> collections.
>
> We found that Solr admin would report Solr instance states as not
> “current”. Is it indicative of the above issue?
>
> We checked logs and found various errors/warnings, but they don’t seem to
> be indicative of the above issue (or if they are – it’s not yet
> clear/obvious or maybe indirectly related). The error message is like this:
> 8/27/2014 2:01:24 AM ERROR SolrCmdDistributor
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error
> opening new searcher. exceeded limit of maxWarmingSearchers=2, try again
> later.
>
> This is our autocommit setting.
>
> 15000
> 10
> false
>
> 30
>
> The searcher takes less than 1.5 minutes and the soft commit setting is
> set for every 5 minutes. So there is no way to end up with more than two
> searchers.
>
> The searcher registeredAttime and openedAttime are sometimes 12-13 hours
> old and we end up bouncing could.
>
> Any help to solve this issue is appreciated.
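The replica comparison suggested above could look roughly like the following Python sketch. It only builds a per-core query URL with `distrib=false` and diffs two id collections. Fetching and paging through the results is omitted, and the core URL and function names are hypothetical.

```python
from typing import Dict, Iterable, Set
from urllib.parse import urlencode


def replica_query_url(core_url: str, query: str = "*:*", rows: int = 0) -> str:
    """Build a select URL that is answered by this core only (distrib=false)."""
    params = urlencode({"q": query, "distrib": "false",
                        "rows": rows, "wt": "json"})
    return f"{core_url}/select?{params}"


def diff_replicas(ids_a: Iterable[str],
                  ids_b: Iterable[str]) -> Dict[str, Set[str]]:
    """Return the ids that appear in only one of the two replicas."""
    a, b = set(ids_a), set(ids_b)
    return {"only_in_a": a - b, "only_in_b": b - a}


if __name__ == "__main__":
    # Hypothetical core URL for one replica of shard1.
    print(replica_query_url("http://host1:8983/solr/collection1_shard1_replica1"))
    print(diff_replicas(["1", "2", "3"], ["2", "3", "4"]))
```

With `rows=0` the response still carries numFound, so comparing that count across the replicas of one shard is a cheap first check before diffing full id lists.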
solr query gives different numFound upon refreshing
Hi,

We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. We have three collections. We recently upgraded from 4.4.0 to 4.8. We have ~850 million documents.

We are facing an issue where refreshing a Solr query may give different results (number of documents returned). This issue is seen in all three collections.

We found that Solr admin would report Solr instance states as not “current”. Is it indicative of the above issue?

We checked the logs and found various errors and warnings, but they don’t seem to be indicative of the above issue (or if they are, it’s not yet obvious, or they may be only indirectly related). The error message is like this:

8/27/2014 2:01:24 AM ERROR SolrCmdDistributor org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.

This is our autocommit setting.

15000 10 false 30

The searcher takes less than 1.5 minutes to warm and the soft commit is set for every 5 minutes, so there should be no way to end up with more than two searchers.

The searcher registeredAt and openedAt times are sometimes 12-13 hours old, and we end up bouncing the cloud.

Any help to solve this issue is appreciated.