Re: inconsistent results
On 5/3/2018 12:55 PM, Satya Marivada wrote:
> We have a Solr (6.3.0) index which is being re-indexed every night; it
> takes about 6-7 hours for the indexing to complete. During the time of
> re-indexing, the index becomes flaky and serves an inconsistent count of
> documents: 70,000 at times and 80,000 at times. After the indexing is
> completed, it serves the consistent and correct number of documents that
> it has indexed from the database. Any suggestions on this?

Initial guess is that there are commits being fired before the whole
indexing process is complete. If you're running in cloud mode, there
could be other things going on.

> Also, Solr writes to the same location as the current index during
> re-indexing. Could this be a cause for concern?

When you use an existing index as the write location for a re-index, you
must be very careful to ensure that you do not ever send any commit
requests before the entire indexing process is complete. The autoCommit
config in solrconfig.xml must have openSearcher set to false, and
autoSoftCommit must not be active. That way, all queries sent before the
process completes will be handled by the index that existed before the
indexing process started. A commit when the process is done will send new
queries to the new state of the index.

An alternate idea would be to index the replacement index into a
different core/collection, and then swap the indexes. In SolrCloud mode,
the swap would be accomplished using the Collection Alias feature.

Thanks,
Shawn
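For reference, a minimal solrconfig.xml sketch of the setup Shawn
describes (hard commits that flush to disk but never open a new searcher,
soft commits disabled); the 60-second interval is just an illustrative
value, not something from this thread:

    <updateHandler class="solr.DirectUpdateHandler2">
      <!-- Hard commit regularly to truncate the transaction log, but
           keep openSearcher=false so queries stay on the old view. -->
      <autoCommit>
        <maxTime>60000</maxTime>
        <openSearcher>false</openSearcher>
      </autoCommit>
      <!-- A soft commit opens a new searcher, so it must stay disabled
           (maxTime of -1) until the re-index finishes. -->
      <autoSoftCommit>
        <maxTime>-1</maxTime>
      </autoSoftCommit>
    </updateHandler>

With this in place, the indexing job issues one explicit commit (which
opens a searcher by default) only after the full re-index completes.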
Re: inconsistent results
Yes, we are doing a clean and full import. Is it not supposed to serve
the old (existing) index until the new index is built, and then do a
cleanup, replacing the old index after the new one is built? Would a full
import without clean avoid this problem?

Thanks Erick, this would be useful.

On Thu, May 3, 2018, 4:28 PM Erick Erickson wrote:
> The short form is that different replicas in a shard have different
> commit points if you go by wall-clock time. So during heavy indexing,
> you can happen to catch the different counts. That really shouldn't
> happen, though, unless you're clearing the index first on the
> assumption that you're replacing the same docs each time.
>
> One solution people use is to index to a "dark" collection, then use
> collection aliasing to atomically switch when the job is done.
>
> Best,
> Erick
>
> On Thu, May 3, 2018 at 11:55 AM, Satya Marivada wrote:
> > Hi there,
> >
> > We have a Solr (6.3.0) index which is being re-indexed every night; it
> > takes about 6-7 hours for the indexing to complete. During the time of
> > re-indexing, the index becomes flaky and serves an inconsistent count
> > of documents: 70,000 at times and 80,000 at times. After the indexing
> > is completed, it serves the consistent and correct number of documents
> > that it has indexed from the database. Any suggestions on this?
> >
> > Also, Solr writes to the same location as the current index during
> > re-indexing. Could this be a cause for concern?
> >
> > Thanks,
> > Satya
Re: inconsistent results
The short form is that different replicas in a shard have different
commit points if you go by wall-clock time. So during heavy indexing, you
can happen to catch the different counts. That really shouldn't happen,
though, unless you're clearing the index first on the assumption that
you're replacing the same docs each time.

One solution people use is to index to a "dark" collection, then use
collection aliasing to atomically switch when the job is done.

Best,
Erick

On Thu, May 3, 2018 at 11:55 AM, Satya Marivada wrote:
> Hi there,
>
> We have a Solr (6.3.0) index which is being re-indexed every night; it
> takes about 6-7 hours for the indexing to complete. During the time of
> re-indexing, the index becomes flaky and serves an inconsistent count of
> documents: 70,000 at times and 80,000 at times. After the indexing is
> completed, it serves the consistent and correct number of documents that
> it has indexed from the database. Any suggestions on this?
>
> Also, Solr writes to the same location as the current index during
> re-indexing. Could this be a cause for concern?
>
> Thanks,
> Satya
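The aliasing switch Erick mentions is done with the Collections API. A
sketch with hypothetical host, alias, and collection names ("search" is
the alias the clients query; "products_20180504" is the freshly built
dark collection):

    http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=search&collections=products_20180504

Re-issuing CREATEALIAS with the same alias name repoints it atomically,
so queries never see a half-built index.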
inconsistent results
Hi there,

We have a Solr (6.3.0) index which is being re-indexed every night; it
takes about 6-7 hours for the indexing to complete. During the time of
re-indexing, the index becomes flaky and serves an inconsistent count of
documents: 70,000 at times and 80,000 at times. After the indexing is
completed, it serves the consistent and correct number of documents that
it has indexed from the database. Any suggestions on this?

Also, Solr writes to the same location as the current index during
re-indexing. Could this be a cause for concern?

Thanks,
Satya
Re: Inconsistent results for facet queries
I'm not sure if that method is viable for reindexing and fetching the
whole collection at once for us, but unless there is something inherent
in that process which happens at the collection level, we could do it a
few shards at a time since it is a multi-tenant setup. I'll see if we can
set up a small test in QA for this and test it out.

This facet issue is the only one we've noticed, and it can be worked
around, so we might end up just waiting until we reindex for version 7.X
to permanently fix it.

Thanks,
Chris

On Thu, Oct 12, 2017 at 1:41 PM Erick Erickson wrote:
> (1) It doesn't matter whether it "affects only segments being merged".
> You can't get accurate information if different segments have
> different expectations.
>
> (2) I strongly doubt it. The problem is that the "tainted" segments'
> meta-data is still read when merging. If the segment consisted of
> _only_ deleted documents you'd probably lose it, but it'll be
> re-merged long before it consists of exclusively deleted documents.
>
> Really, you have to re-index to be sure. I suspect you can find some
> way to do this faster than exploring undefined behavior and hoping.
>
> If you can re-index _anywhere_ to a collection with the same number of
> shards you can get this done. It'll take some tricky dancing, but:
>
> 0> Copy one index directory from each shard someplace safe.
> 1> Reindex somewhere; single-replica will do.
> 2> Delete all replicas except one for your current collection.
> 3> Issue an admin API fetchindex command for each replica in the old
> collection, pulling the index "from the right place" in the new
> collection. It's important that there only be a single replica for
> each shard active at this point. These two collections do _not_ need to
> be part of the same SolrCloud; the fetchindex command just takes a URL
> of the core to fetch from.
> 4> Add the replicas back and let them replicate.
>
> Your installation would be unavailable for searching during steps 2-4,
> of course.
>
> Best,
> Erick
>
> On Thu, Oct 12, 2017 at 9:01 AM, Chris Ulicny wrote:
> > We tested the query on all replicas for the given shard, and they all
> > have the same issue. So deleting and adding another replica won't fix
> > the problem since the leader is exhibiting the behavior as well. I
> > believe the second replica was moved (new one added, old one deleted)
> > between nodes and so was just a copy of the leader's index after the
> > problematic merge happened.
> >
> > bq: Anything that didn't merge old segments, just threw them
> > away when empty (which was my idea) would possibly require as much
> > disk space as the index currently occupied, so doesn't help your
> > disk-constrained situation.
> >
> > Something like this was originally what I thought might fix the issue.
> > If we reindex the data for the affected shard, it would possibly
> > delete all docs from the old segments and just drop them instead of
> > merging them. As mentioned, you'd expect the problems to persist
> > through subsequent merges. So I've got two questions:
> >
> > 1) If the problem persists through merges, does it only affect the
> > segments being merged, and then when Solr goes looking for the values,
> > it comes up empty? Instead of all segments being affected by a single
> > merge they weren't a part of.
> >
> > 2) Is it expected that any large tainted segments will eventually
> > merge with clean segments, resulting in more tainted segments as
> > enough docs are deleted on the large segments?
> > Also, we aren't disk constrained as much as previously. Reindexing a
> > subset of docs is possible, but a full clean collection reindex isn't.
> >
> > Thanks,
> > Chris
> >
> > On Thu, Oct 12, 2017 at 11:13 AM Erick Erickson
> > wrote:
> >
> >> Never mind. Anything that didn't merge old segments, just threw them
> >> away when empty (which was my idea), would possibly require as much
> >> disk space as the index currently occupied, so doesn't help your
> >> disk-constrained situation.
> >>
> >> Best,
> >> Erick
> >>
> >> On Thu, Oct 12, 2017 at 8:06 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >> > If it's _only_ on a particular replica, here's what you could do:
> >> > Just DELETEREPLICA on it, then ADDREPLICA to bring it back. You can
> >> > define the "node" parameter on ADDREPLICA to get it back on the same
> >> > node. Then the normal replication process would pull the entire
> >> > index down from the leader.
> >> >
> >> > My bet, though, is that this wouldn't really fix things. While it
> >> > fixes the particular case you've noticed, I'd guess others would pop
> >> > up. You can see what replicas return what by firing individual
> >> > queries at the particular replica in question with distrib=false,
> >> > something like
> >> >
> >> > solr_server:port/solr/collection1_shard1_replica1/query?distrib=false
> >> > blah blah
> >> >
> >> > bq: It
Re: Inconsistent results for facet queries
(1) It doesn't matter whether it "affects only segments being merged".
You can't get accurate information if different segments have different
expectations.

(2) I strongly doubt it. The problem is that the "tainted" segments'
meta-data is still read when merging. If the segment consisted of _only_
deleted documents you'd probably lose it, but it'll be re-merged long
before it consists of exclusively deleted documents.

Really, you have to re-index to be sure. I suspect you can find some way
to do this faster than exploring undefined behavior and hoping.

If you can re-index _anywhere_ to a collection with the same number of
shards you can get this done. It'll take some tricky dancing, but:

0> Copy one index directory from each shard someplace safe.
1> Reindex somewhere; single-replica will do.
2> Delete all replicas except one for your current collection.
3> Issue an admin API fetchindex command for each replica in the old
collection, pulling the index "from the right place" in the new
collection. It's important that there only be a single replica for each
shard active at this point. These two collections do _not_ need to be
part of the same SolrCloud; the fetchindex command just takes a URL of
the core to fetch from.
4> Add the replicas back and let them replicate.

Your installation would be unavailable for searching during steps 2-4,
of course.

Best,
Erick

On Thu, Oct 12, 2017 at 9:01 AM, Chris Ulicny wrote:
> We tested the query on all replicas for the given shard, and they all
> have the same issue. So deleting and adding another replica won't fix
> the problem since the leader is exhibiting the behavior as well. I
> believe the second replica was moved (new one added, old one deleted)
> between nodes and so was just a copy of the leader's index after the
> problematic merge happened.
>
> bq: Anything that didn't merge old segments, just threw them
> away when empty (which was my idea) would possibly require as much
> disk space as the index currently occupied, so doesn't help your
> disk-constrained situation.
>
> Something like this was originally what I thought might fix the issue.
> If we reindex the data for the affected shard, it would possibly delete
> all docs from the old segments and just drop them instead of merging
> them. As mentioned, you'd expect the problems to persist through
> subsequent merges. So I've got two questions:
>
> 1) If the problem persists through merges, does it only affect the
> segments being merged, and then when Solr goes looking for the values,
> it comes up empty? Instead of all segments being affected by a single
> merge they weren't a part of.
>
> 2) Is it expected that any large tainted segments will eventually merge
> with clean segments, resulting in more tainted segments as enough docs
> are deleted on the large segments?
>
> Also, we aren't disk constrained as much as previously. Reindexing a
> subset of docs is possible, but a full clean collection reindex isn't.
>
> Thanks,
> Chris
>
> On Thu, Oct 12, 2017 at 11:13 AM Erick Erickson
> wrote:
>
>> Never mind. Anything that didn't merge old segments, just threw them
>> away when empty (which was my idea), would possibly require as much
>> disk space as the index currently occupied, so doesn't help your
>> disk-constrained situation.
>>
>> Best,
>> Erick
>>
>> On Thu, Oct 12, 2017 at 8:06 AM, Erick Erickson
>> wrote:
>> > If it's _only_ on a particular replica, here's what you could do:
>> > Just DELETEREPLICA on it, then ADDREPLICA to bring it back.
>> > You can
>> > define the "node" parameter on ADDREPLICA to get it back on the same
>> > node. Then the normal replication process would pull the entire index
>> > down from the leader.
>> >
>> > My bet, though, is that this wouldn't really fix things. While it
>> > fixes the particular case you've noticed, I'd guess others would pop
>> > up. You can see what replicas return what by firing individual
>> > queries at the particular replica in question with distrib=false,
>> > something like
>> >
>> > solr_server:port/solr/collection1_shard1_replica1/query?distrib=false
>> > blah blah
>> >
>> > bq: It is exceedingly unfortunate that reindexing the data on that
>> > shard only probably won't end up fixing the problem
>> >
>> > Well, we've been working on the DWIM (Do What I Mean) feature for
>> > years, but progress has stalled.
>> >
>> > How would that work? You have two segments with vastly different
>> > characteristics for a field. You could change the type, the
>> > multiValued-ness, the analysis chain; there's no end to the things
>> > that could go wrong. Fixing them actually _is_ impossible given how
>> > Lucene is structured.
>> >
>> > Hmmm, you've now given me a brainstorm I'll suggest on the JIRA
>> > system after I talk to the dev list...
>> >
>> > Consider indexed=true stored=false. After stemming, "running" can be
>> > indexed as "run". At merge time you have no way of knowing that
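A sketch of the fetchindex call in step 3 of Erick's procedure, with
hypothetical hosts and core names; fetchindex is a command of the
replication handler, and masterUrl points at the core to pull from:

    http://oldhost:8983/solr/oldcoll_shard1_replica1/replication?command=fetchindex&masterUrl=http://newhost:8983/solr/newcoll_shard1_replica1/replication

Repeat once per shard, adjusting the shard number on both ends.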
Re: Inconsistent results for facet queries
We tested the query on all replicas for the given shard, and they all
have the same issue. So deleting and adding another replica won't fix the
problem since the leader is exhibiting the behavior as well. I believe
the second replica was moved (new one added, old one deleted) between
nodes and so was just a copy of the leader's index after the problematic
merge happened.

bq: Anything that didn't merge old segments, just threw them
away when empty (which was my idea) would possibly require as much
disk space as the index currently occupied, so doesn't help your
disk-constrained situation.

Something like this was originally what I thought might fix the issue. If
we reindex the data for the affected shard, it would possibly delete all
docs from the old segments and just drop them instead of merging them. As
mentioned, you'd expect the problems to persist through subsequent
merges. So I've got two questions:

1) If the problem persists through merges, does it only affect the
segments being merged, and then when Solr goes looking for the values, it
comes up empty? Instead of all segments being affected by a single merge
they weren't a part of.

2) Is it expected that any large tainted segments will eventually merge
with clean segments, resulting in more tainted segments as enough docs
are deleted on the large segments?

Also, we aren't disk constrained as much as previously. Reindexing a
subset of docs is possible, but a full clean collection reindex isn't.

Thanks,
Chris

On Thu, Oct 12, 2017 at 11:13 AM Erick Erickson wrote:
> Never mind. Anything that didn't merge old segments, just threw them
> away when empty (which was my idea), would possibly require as much
> disk space as the index currently occupied, so doesn't help your
> disk-constrained situation.
>
> Best,
> Erick
>
> On Thu, Oct 12, 2017 at 8:06 AM, Erick Erickson
> wrote:
> > If it's _only_ on a particular replica, here's what you could do:
> > Just DELETEREPLICA on it, then ADDREPLICA to bring it back. You can
> > define the "node" parameter on ADDREPLICA to get it back on the same
> > node. Then the normal replication process would pull the entire index
> > down from the leader.
> >
> > My bet, though, is that this wouldn't really fix things. While it
> > fixes the particular case you've noticed, I'd guess others would pop
> > up. You can see what replicas return what by firing individual queries
> > at the particular replica in question with distrib=false, something
> > like
> > solr_server:port/solr/collection1_shard1_replica1/query?distrib=false
> > blah blah
> >
> > bq: It is exceedingly unfortunate that reindexing the data on that
> > shard only probably won't end up fixing the problem
> >
> > Well, we've been working on the DWIM (Do What I Mean) feature for
> > years, but progress has stalled.
> >
> > How would that work? You have two segments with vastly different
> > characteristics for a field. You could change the type, the
> > multiValued-ness, the analysis chain; there's no end to the things
> > that could go wrong. Fixing them actually _is_ impossible given how
> > Lucene is structured.
> >
> > Hmmm, you've now given me a brainstorm I'll suggest on the JIRA
> > system after I talk to the dev list...
> >
> > Consider indexed=true stored=false. After stemming, "running" can be
> > indexed as "run". At merge time you have no way of knowing that
> > "running" was the original term so you simply couldn't fix it on
> > merge, not to mention that the performance penalty would be...er...
> > severe.
> > Best,
> > Erick
> >
> > On Thu, Oct 12, 2017 at 5:53 AM, Chris Ulicny wrote:
> >> I thought that decision would come back to bite us somehow. At the
> >> time, we didn't have enough space available to do a fresh reindex
> >> alongside the old collection, so the only course of action available
> >> was to index over the old one, and the vast majority of its use
> >> worked as expected.
> >>
> >> We're planning on upgrading to version 7 at some point in the near
> >> future and will have enough space to do a full, clean reindex at that
> >> time.
> >>
> >> bq: This can propagate through all following segment merges IIUC.
> >>
> >> It is exceedingly unfortunate that reindexing the data on that shard
> >> only probably won't end up fixing the problem.
> >>
> >> Out of curiosity, are there any good write-ups or documentation on
> >> how two (or more) Lucene segments are merged, or is it just worth
> >> looking at the source code to figure that out?
> >>
> >> Thanks,
> >> Chris
> >>
> >> On Wed, Oct 11, 2017 at 6:55 PM Erick Erickson
> >> wrote:
> >>
> >>> bq: ...but the collection wasn't emptied first
> >>>
> >>> This is what I'd suspect is the problem. Here's the issue: Segments
> >>> aren't merged identically on all replicas. So at some point you had
> >>> this field indexed without docValues, changed that and re-indexed. But
Re: Inconsistent results for facet queries
Never mind. Anything that didn't merge old segments, just threw them away
when empty (which was my idea), would possibly require as much disk space
as the index currently occupied, so doesn't help your disk-constrained
situation.

Best,
Erick

On Thu, Oct 12, 2017 at 8:06 AM, Erick Erickson wrote:
> If it's _only_ on a particular replica, here's what you could do:
> Just DELETEREPLICA on it, then ADDREPLICA to bring it back. You can
> define the "node" parameter on ADDREPLICA to get it back on the same
> node. Then the normal replication process would pull the entire index
> down from the leader.
>
> My bet, though, is that this wouldn't really fix things. While it fixes
> the particular case you've noticed, I'd guess others would pop up. You
> can see what replicas return what by firing individual queries at the
> particular replica in question with distrib=false, something like
> solr_server:port/solr/collection1_shard1_replica1/query?distrib=false
> blah blah
>
> bq: It is exceedingly unfortunate that reindexing the data on that
> shard only probably won't end up fixing the problem
>
> Well, we've been working on the DWIM (Do What I Mean) feature for
> years, but progress has stalled.
>
> How would that work? You have two segments with vastly different
> characteristics for a field. You could change the type, the
> multiValued-ness, the analysis chain; there's no end to the things that
> could go wrong. Fixing them actually _is_ impossible given how Lucene
> is structured.
>
> Hmmm, you've now given me a brainstorm I'll suggest on the JIRA
> system after I talk to the dev list...
>
> Consider indexed=true stored=false. After stemming, "running" can be
> indexed as "run". At merge time you have no way of knowing that
> "running" was the original term so you simply couldn't fix it on merge,
> not to mention that the performance penalty would be...er... severe.
>
> Best,
> Erick
>
> On Thu, Oct 12, 2017 at 5:53 AM, Chris Ulicny wrote:
>> I thought that decision would come back to bite us somehow. At the
>> time, we didn't have enough space available to do a fresh reindex
>> alongside the old collection, so the only course of action available
>> was to index over the old one, and the vast majority of its use worked
>> as expected.
>>
>> We're planning on upgrading to version 7 at some point in the near
>> future and will have enough space to do a full, clean reindex at that
>> time.
>>
>> bq: This can propagate through all following segment merges IIUC.
>>
>> It is exceedingly unfortunate that reindexing the data on that shard
>> only probably won't end up fixing the problem.
>>
>> Out of curiosity, are there any good write-ups or documentation on how
>> two (or more) Lucene segments are merged, or is it just worth looking
>> at the source code to figure that out?
>>
>> Thanks,
>> Chris
>>
>> On Wed, Oct 11, 2017 at 6:55 PM Erick Erickson
>> wrote:
>>
>>> bq: ...but the collection wasn't emptied first
>>>
>>> This is what I'd suspect is the problem. Here's the issue: Segments
>>> aren't merged identically on all replicas. So at some point you had
>>> this field indexed without docValues, changed that and re-indexed. But
>>> the segment merging could "read" the first segment it's going to merge
>>> and think it knows about docValues for that field, when in fact that
>>> segment had the old (non-DV) definition.
>>>
>>> This would not necessarily be the same on all replicas, even on the
>>> _same_ shard.
>>>
>>> This can propagate through all following segment merges IIUC.
>>>
>>> So my bet is that if you index into a new collection, everything will
>>> be fine. You can also just delete everything first, but I usually
>>> prefer a new collection so I'm absolutely and positively sure that the
>>> above can't happen.
>>>
>>> Best,
>>> Erick
>>>
>>> On Wed, Oct 11, 2017 at 12:51 PM, Chris Ulicny wrote:
>>> > Hi,
>>> >
>>> > We've run into a strange issue with our deployment of SolrCloud
>>> > 6.3.0. Essentially, a standard facet query on a string field usually
>>> > comes back empty when it shouldn't. However, every now and again the
>>> > query actually returns the correct values. This is only affecting a
>>> > single shard in our setup.
>>> >
>>> > The behavior pattern generally looks like this: the query works
>>> > properly when it hasn't been run recently, and then returns nothing
>>> > after the query seems to have been cached (< 50ms QTime). Wait a
>>> > while and you get the correct result followed by blanks. It doesn't
>>> > matter which replica of the shard is queried; the results are the
>>> > same.
>>> >
>>> > The general query in question looks like
>>> > /select?q=*:*&facet=true&facet.field=market&rows=0
>>> >
>>> > The field is defined in the schema as
>>> > <field name="market" type="string" ... docValues="true"/>
>>> >
>>> > There are numerous other fields defined similarly, and they do not
>>> > exhibit the same behavior when used as the facet.field value. They
Re: Inconsistent results for facet queries
If it's _only_ on a particular replica, here's what you could do:
Just DELETEREPLICA on it, then ADDREPLICA to bring it back. You can
define the "node" parameter on ADDREPLICA to get it back on the same
node. Then the normal replication process would pull the entire index
down from the leader.

My bet, though, is that this wouldn't really fix things. While it fixes
the particular case you've noticed, I'd guess others would pop up. You
can see what replicas return what by firing individual queries at the
particular replica in question with distrib=false, something like
solr_server:port/solr/collection1_shard1_replica1/query?distrib=false
blah blah

bq: It is exceedingly unfortunate that reindexing the data on that shard
only probably won't end up fixing the problem

Well, we've been working on the DWIM (Do What I Mean) feature for years,
but progress has stalled.

How would that work? You have two segments with vastly different
characteristics for a field. You could change the type, the
multiValued-ness, the analysis chain; there's no end to the things that
could go wrong. Fixing them actually _is_ impossible given how Lucene is
structured.

Hmmm, you've now given me a brainstorm I'll suggest on the JIRA system
after I talk to the dev list...

Consider indexed=true stored=false. After stemming, "running" can be
indexed as "run". At merge time you have no way of knowing that "running"
was the original term so you simply couldn't fix it on merge, not to
mention that the performance penalty would be...er... severe.

Best,
Erick

On Thu, Oct 12, 2017 at 5:53 AM, Chris Ulicny wrote:
> I thought that decision would come back to bite us somehow. At the
> time, we didn't have enough space available to do a fresh reindex
> alongside the old collection, so the only course of action available
> was to index over the old one, and the vast majority of its use worked
> as expected.
>
> We're planning on upgrading to version 7 at some point in the near
> future and will have enough space to do a full, clean reindex at that
> time.
>
> bq: This can propagate through all following segment merges IIUC.
>
> It is exceedingly unfortunate that reindexing the data on that shard
> only probably won't end up fixing the problem.
>
> Out of curiosity, are there any good write-ups or documentation on how
> two (or more) Lucene segments are merged, or is it just worth looking
> at the source code to figure that out?
>
> Thanks,
> Chris
>
> On Wed, Oct 11, 2017 at 6:55 PM Erick Erickson
> wrote:
>
>> bq: ...but the collection wasn't emptied first
>>
>> This is what I'd suspect is the problem. Here's the issue: Segments
>> aren't merged identically on all replicas. So at some point you had
>> this field indexed without docValues, changed that and re-indexed. But
>> the segment merging could "read" the first segment it's going to merge
>> and think it knows about docValues for that field, when in fact that
>> segment had the old (non-DV) definition.
>>
>> This would not necessarily be the same on all replicas, even on the
>> _same_ shard.
>>
>> This can propagate through all following segment merges IIUC.
>>
>> So my bet is that if you index into a new collection, everything will
>> be fine. You can also just delete everything first, but I usually
>> prefer a new collection so I'm absolutely and positively sure that the
>> above can't happen.
>>
>> Best,
>> Erick
>>
>> On Wed, Oct 11, 2017 at 12:51 PM, Chris Ulicny wrote:
>> > Hi,
>> >
>> > We've run into a strange issue with our deployment of SolrCloud 6.3.0.
>> > Essentially, a standard facet query on a string field usually comes
>> > back empty when it shouldn't. However, every now and again the query
>> > actually returns the correct values. This is only affecting a single
>> > shard in our setup.
>> >
>> > The behavior pattern generally looks like this: the query works
>> > properly when it hasn't been run recently, and then returns nothing
>> > after the query seems to have been cached (< 50ms QTime). Wait a
>> > while and you get the correct result followed by blanks. It doesn't
>> > matter which replica of the shard is queried; the results are the
>> > same.
>> >
>> > The general query in question looks like
>> > /select?q=*:*&facet=true&facet.field=market&rows=0
>> >
>> > The field is defined in the schema as
>> > <field name="market" type="string" ... docValues="true"/>
>> >
>> > There are numerous other fields defined similarly, and they do not
>> > exhibit the same behavior when used as the facet.field value. They
>> > consistently return the right results on the shard in question.
>> >
>> > If we add facet.method=enum to the query, we get the correct results
>> > every time (though slower). So our assumption is that something is
>> > only sporadically working when the fc method is chosen by default.
>> >
>> > A few other notes about the collection. This collection is not
>> > freshly indexed, but it has not had any particularly bad failures
>> > beyond follower replicas going down
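To make the distrib=false check concrete, a sketch against a hypothetical
node (the core name comes from the admin UI's Core Selector);
distrib=false keeps the query on that one replica instead of fanning it
out across the collection:

    http://solr-host:8983/solr/collection1_shard1_replica1/select?q=*:*&rows=0&distrib=false

Comparing numFound from this query across the replicas of a shard shows
which replicas disagree.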
Re: Inconsistent results for facet queries
I thought that decision would come back to bite us somehow. At the time,
we didn't have enough space available to do a fresh reindex alongside the
old collection, so the only course of action available was to index over
the old one, and the vast majority of its use worked as expected.

We're planning on upgrading to version 7 at some point in the near future
and will have enough space to do a full, clean reindex at that time.

bq: This can propagate through all following segment merges IIUC.

It is exceedingly unfortunate that reindexing the data on that shard only
probably won't end up fixing the problem.

Out of curiosity, are there any good write-ups or documentation on how
two (or more) Lucene segments are merged, or is it just worth looking at
the source code to figure that out?

Thanks,
Chris

On Wed, Oct 11, 2017 at 6:55 PM Erick Erickson wrote:
> bq: ...but the collection wasn't emptied first
>
> This is what I'd suspect is the problem. Here's the issue: Segments
> aren't merged identically on all replicas. So at some point you had
> this field indexed without docValues, changed that and re-indexed. But
> the segment merging could "read" the first segment it's going to merge
> and think it knows about docValues for that field, when in fact that
> segment had the old (non-DV) definition.
>
> This would not necessarily be the same on all replicas, even on the
> _same_ shard.
>
> This can propagate through all following segment merges IIUC.
>
> So my bet is that if you index into a new collection, everything will
> be fine. You can also just delete everything first, but I usually
> prefer a new collection so I'm absolutely and positively sure that the
> above can't happen.
>
> Best,
> Erick
>
> On Wed, Oct 11, 2017 at 12:51 PM, Chris Ulicny wrote:
> > Hi,
> >
> > We've run into a strange issue with our deployment of SolrCloud
> > 6.3.0. Essentially, a standard facet query on a string field usually
> > comes back empty when it shouldn't. However, every now and again the
> > query actually returns the correct values. This is only affecting a
> > single shard in our setup.
> >
> > The behavior pattern generally looks like this: the query works
> > properly when it hasn't been run recently, and then returns nothing
> > after the query seems to have been cached (< 50ms QTime). Wait a while
> > and you get the correct result followed by blanks. It doesn't matter
> > which replica of the shard is queried; the results are the same.
> >
> > The general query in question looks like
> > /select?q=*:*&facet=true&facet.field=market&rows=0
> >
> > The field is defined in the schema as
> > <field name="market" type="string" ... docValues="true"/>
> >
> > There are numerous other fields defined similarly, and they do not
> > exhibit the same behavior when used as the facet.field value. They
> > consistently return the right results on the shard in question.
> >
> > If we add facet.method=enum to the query, we get the correct results
> > every time (though slower). So our assumption is that something is
> > only sporadically working when the fc method is chosen by default.
> >
> > A few other notes about the collection. This collection is not
> > freshly indexed, but it has not had any particularly bad failures
> > beyond follower replicas going down due to PKIAuthentication timeouts
> > (which has been fixed). It has also had a full reindex after a schema
> > change added docValues to some fields (including the one above), but
> > the collection wasn't emptied first. We are using the composite router
> > to co-locate documents.
> >
> > Currently, our plan is just to reindex all of the documents on the
> > affected shard to see if that fixes the problem. Any ideas on what
> > might be happening or ways to troubleshoot this are appreciated.
> >
> > Thanks,
> > Chris
Re: Inconsistent results for facet queries
bq: ...but the collection wasn't emptied first

This is what I'd suspect is the problem. Here's the issue: Segments
aren't merged identically on all replicas. So at some point you had this
field indexed without docValues, changed that, and re-indexed. But the
segment merging could "read" the first segment it's going to merge and
think it knows about docValues for that field, when in fact that segment
had the old (non-DV) definition.

This would not necessarily be the same on all replicas, even on the
_same_ shard.

This can propagate through all following segment merges IIUC.

So my bet is that if you index into a new collection, everything will be
fine. You can also just delete everything first, but I usually prefer a
new collection so I'm absolutely and positively sure that the above can't
happen.

Best,
Erick

On Wed, Oct 11, 2017 at 12:51 PM, Chris Ulicny wrote:
> Hi,
>
> We've run into a strange issue with our deployment of SolrCloud 6.3.0.
> Essentially, a standard facet query on a string field usually comes
> back empty when it shouldn't. However, every now and again the query
> actually returns the correct values. This is only affecting a single
> shard in our setup.
>
> The behavior pattern generally looks like this: the query works
> properly when it hasn't been run recently, and then returns nothing
> after the query seems to have been cached (< 50ms QTime). Wait a while
> and you get the correct result followed by blanks. It doesn't matter
> which replica of the shard is queried; the results are the same.
>
> The general query in question looks like
> /select?q=*:*&facet=true&facet.field=market&rows=0
>
> The field is defined in the schema as
> <field name="market" type="string" ... docValues="true"/>
>
> There are numerous other fields defined similarly, and they do not
> exhibit the same behavior when used as the facet.field value. They
> consistently return the right results on the shard in question.
>
> If we add facet.method=enum to the query, we get the correct results
> every time (though slower). So our assumption is that something is only
> sporadically working when the fc method is chosen by default.
>
> A few other notes about the collection. This collection is not freshly
> indexed, but it has not had any particularly bad failures beyond
> follower replicas going down due to PKIAuthentication timeouts (which
> has been fixed). It has also had a full reindex after a schema change
> added docValues to some fields (including the one above), but the
> collection wasn't emptied first. We are using the composite router to
> co-locate documents.
>
> Currently, our plan is just to reindex all of the documents on the
> affected shard to see if that fixes the problem. Any ideas on what
> might be happening or ways to troubleshoot this are appreciated.
>
> Thanks,
> Chris
Inconsistent results for facet queries
Hi,

We've run into a strange issue with our deployment of SolrCloud 6.3.0.
Essentially, a standard facet query on a string field usually comes back
empty when it shouldn't. However, every now and again the query actually
returns the correct values. This is only affecting a single shard in our
setup.

The behavior pattern generally looks like this: the query works properly
when it hasn't been run recently, and then returns nothing after the
query seems to have been cached (< 50ms QTime). Wait a while and you get
the correct result followed by blanks. It doesn't matter which replica of
the shard is queried; the results are the same.

The general query in question looks like
/select?q=*:*&facet=true&facet.field=market&rows=0

The field is defined in the schema as
<field name="market" type="string" ... docValues="true"/>

There are numerous other fields defined similarly, and they do not
exhibit the same behavior when used as the facet.field value. They
consistently return the right results on the shard in question.

If we add facet.method=enum to the query, we get the correct results
every time (though slower). So our assumption is that something is only
sporadically working when the fc method is chosen by default.

A few other notes about the collection. This collection is not freshly
indexed, but it has not had any particularly bad failures beyond follower
replicas going down due to PKIAuthentication timeouts (which has been
fixed). It has also had a full reindex after a schema change added
docValues to some fields (including the one above), but the collection
wasn't emptied first. We are using the composite router to co-locate
documents.

Currently, our plan is just to reindex all of the documents on the
affected shard to see if that fixes the problem. Any ideas on what might
be happening or ways to troubleshoot this are appreciated.

Thanks,
Chris
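The workaround Chris describes is just adding facet.method=enum to the
otherwise unchanged request; a sketch with a hypothetical host and
collection name:

    http://solr-host:8983/solr/collection1/select?q=*:*&rows=0&facet=true&facet.field=market&facet.method=enum

facet.method=enum walks the terms index and intersects document sets per
term instead of reading the per-document docValues structures, which is
presumably why it sidesteps the tainted segment metadata, at the cost of
speed.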
Re: Inconsistent results with solr admin ui and solrj
Hi,
I did as you said, and now it is coming up OK. And what are the things to
look for while checking on these kinds of issues, such as mismatched
counts, Luke requests not returning all the fields, etc.? The doc sync is
one; how can I programmatically use that info and sync them? Is there any
method in SolrJ?

On 16/08/16 14:50, Jan Høydahl wrote:

Hi,

There is clearly something wrong when your two replicas are not in sync.
Could you go to the “Cloud->Tree” tab of the admin UI and look in the
overseer queue whether you find signs of stuck jobs or something?
Btw - what warnings do you see in the logs? Anything repeatedly popping up?

I would also try the following:

1. Take down the node hosting replica 1 (assuming that replica2 is the
correct, most current)
2. Manually empty the data folder
3. Take the node up again
4. Verify that a full index recovery happens, and that they get back in
sync
5. Run your indexing procedure.
6. Verify that both replicas are still in sync

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 16 Aug 2016 at 06:51, Pranaya Behera <pranaya.beh...@igp.com> wrote:

Hi,
a.) Yes, the index is static, not updated live. We index new documents
over old documents by this sequence: delete all docs, add 10 freshly
fetched from the db, and after adding all the docs to the cloud instance,
commit. Commit happens only once per collection.

b.) I took one shard, and below are the results for each replica; it has
2 replicas.

Replica 2
Last Modified: 33 minutes ago
Num Docs: 127970
Max Doc: 127970
Heap Memory Usage: -1
Deleted Docs: 0
Version: 14530
Segment Count: 5
Optimized: yes
Current: yes
Data: /var/solr/data/product_shard1_replica2/data
Index: /var/solr/data/product_shard1_replica2/data/index.20160816040537452
Impl: org.apache.solr.core.NRTCachingDirectoryFactory

Replica 1
Last Modified: about 19 hours ago
Num Docs: 234013
Max Doc: 234013
Heap Memory Usage: -1
Deleted Docs: 0
Version: 14272
Segment Count: 7
Optimized: yes
Current: no
Data: /var/solr/data/product_shard1_replica1/data
Index: /var/solr/data/product_shard1_replica1/data/index
Impl: org.apache.solr.core.NRTCachingDirectoryFactory

c.) With the admin UI, if I query for all (*:*), it gives a different
numFound each time, e.g.:

1.
{"responseHeader":{"zkConnected":true,"status":0,"QTime":7,"params":{
"q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
"response":{"numFound":452300,"start":0,"maxScore":1.0, ...

2.
{"responseHeader":{"zkConnected":true,"status":0,"QTime":23,"params":{
"q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
"response":{"numFound":574013,"start":0,"maxScore":1.0, ...

This is queried live from the Solr instances.

It happens with any type of query, whether I search in the parent
document or search through child documents to get parents. Sorting is
used in both cases but with a different field: while doing a block join
query, sorting is on the child document field; otherwise it is on the
parent document field.

d.) I don't find any errors in the logs. All warnings only.

On 14/08/16 02:56, Jan Høydahl wrote:

Could it be that your cluster is not in sync, so that when Solr picks
three nodes, results will vary depending on what replica answers?

A few questions:

a) Is your index static, i.e. not being updated live?
b) Can you try to go directly to the core menu of both replicas for each
shard, and compare numDocs / maxDocs for each? Both replicas in each
shard should have the same count.
c) What are you querying on and sorting by? Does it happen with only one
query and sorting?
d) Are there any errors in the logs?

If possible, please share some queries, responses, config, screenshots
etc.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 13 Aug 2016 at 12:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:

Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper
and 3 instances of Solr. All three of them are active and up. One
collection has 3 shards, and each shard has 2 replicas.

Every time we query, whether from SolrJ or the admin UI, we get
inconsistent results, e.g.:
1. numFound is always fluctuating.
2. The facet count shows a count for a field, but a filter query on that
field gets 0 results.
3. Luke requests work per shard but not on the collection when invoked
from curl (not sure whether they give correct info for all the dynamic
fields), and they don't work when called from SolrJ.
4. The admin UI shows expanded results; for the same query issued
through SolrJ, getExpandedResults() gives 0 docs.

What would be the cause of all this? Any pointers on what to look for,
an error or anything in the logs?
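There is no single SolrJ call that reports "replicas out of sync", but a
small sketch along these lines (hypothetical ZooKeeper hosts and
collection name, Solr 6.x SolrJ API) queries each replica directly with
distrib=false and compares numFound, which automates the per-core check
Jan suggested:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.cloud.Replica;
    import org.apache.solr.common.cloud.Slice;

    public class ReplicaSyncCheck {
        public static void main(String[] args) throws Exception {
            String collection = "product"; // hypothetical collection name
            try (CloudSolrClient cloud = new CloudSolrClient.Builder()
                    .withZkHost("zk1:2181,zk2:2181,zk3:2181").build()) {
                cloud.connect();
                // Walk every shard ("slice") and replica in the cluster state.
                for (Slice slice : cloud.getZkStateReader().getClusterState()
                        .getCollection(collection).getSlices()) {
                    long expected = -1;
                    for (Replica replica : slice.getReplicas()) {
                        // e.g. http://host:8983/solr/product_shard1_replica1
                        String coreUrl = replica.getCoreUrl();
                        try (HttpSolrClient core =
                                new HttpSolrClient.Builder(coreUrl).build()) {
                            SolrQuery q = new SolrQuery("*:*");
                            q.setRows(0);
                            q.set("distrib", "false"); // ask only this replica
                            long found = core.query(q).getResults().getNumFound();
                            if (expected < 0) expected = found;
                            System.out.printf("%s %s numFound=%d%s%n",
                                    slice.getName(), replica.getName(), found,
                                    found == expected ? "" : " <-- OUT OF SYNC");
                        }
                    }
                }
            }
        }
    }

Actually re-syncing is still done by the recovery procedure Jan described
(or DELETEREPLICA/ADDREPLICA); SolrJ only helps detect the drift.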
Re: Inconsistent results with solr admin ui and solrj
Hi,

There is clearly something wrong when your two replicas are not in sync.
Could you go to the “Cloud->Tree” tab of the admin UI and look in the
overseer queue whether you find signs of stuck jobs or something?
Btw - what warnings do you see in the logs? Anything repeatedly popping up?

I would also try the following:

1. Take down the node hosting replica 1 (assuming that replica2 is the
correct, most current)
2. Manually empty the data folder
3. Take the node up again
4. Verify that a full index recovery happens, and that they get back in
sync
5. Run your indexing procedure.
6. Verify that both replicas are still in sync

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 16 Aug 2016 at 06:51, Pranaya Behera <pranaya.beh...@igp.com> wrote:
>
> Hi,
> a.) Yes, the index is static, not updated live. We index new documents
> over old documents by this sequence: delete all docs, add 10 freshly
> fetched from the db, and after adding all the docs to the cloud
> instance, commit. Commit happens only once per collection.
> b.) I took one shard, and below are the results for each replica; it
> has 2 replicas.
> Replica 2
> Last Modified: 33 minutes ago
> Num Docs: 127970
> Max Doc: 127970
> Heap Memory Usage: -1
> Deleted Docs: 0
> Version: 14530
> Segment Count: 5
> Optimized: yes
> Current: yes
> Data: /var/solr/data/product_shard1_replica2/data
> Index: /var/solr/data/product_shard1_replica2/data/index.20160816040537452
> Impl: org.apache.solr.core.NRTCachingDirectoryFactory
>
> Replica 1
> Last Modified: about 19 hours ago
> Num Docs: 234013
> Max Doc: 234013
> Heap Memory Usage: -1
> Deleted Docs: 0
> Version: 14272
> Segment Count: 7
> Optimized: yes
> Current: no
> Data: /var/solr/data/product_shard1_replica1/data
> Index: /var/solr/data/product_shard1_replica1/data/index
> Impl: org.apache.solr.core.NRTCachingDirectoryFactory
>
> c.) With the admin UI, if I query for all (*:*), it gives a different
> numFound each time, e.g.:
> 1.
> {"responseHeader":{"zkConnected":true,"status":0,"QTime":7,"params":{
> "q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
> "response":{"numFound":452300,"start":0,"maxScore":1.0, ...
> 2.
> {"responseHeader":{"zkConnected":true,"status":0,"QTime":23,"params":{
> "q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
> "response":{"numFound":574013,"start":0,"maxScore":1.0, ...
> This is queried live from the Solr instances.
>
> It happens with any type of query, whether I search in the parent
> document or search through child documents to get parents. Sorting is
> used in both cases but with a different field: while doing a block join
> query, sorting is on the child document field; otherwise it is on the
> parent document field.
>
> d.) I don't find any errors in the logs. All warnings only.
>
> On 14/08/16 02:56, Jan Høydahl wrote:
>> Could it be that your cluster is not in sync, so that when Solr picks
>> three nodes, results will vary depending on what replica answers?
>>
>> A few questions:
>>
>> a) Is your index static, i.e. not being updated live?
>> b) Can you try to go directly to the core menu of both replicas for
>> each shard, and compare numDocs / maxDocs for each? Both replicas in
>> each shard should have the same count.
>> c) What are you querying on and sorting by? Does it happen with only
>> one query and sorting?
>> d) Are there any errors in the logs?
>>
>> If possible, please share some queries, responses, config, screenshots
>> etc.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>>> On 13 Aug 2016 at 12:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:
>>>
>>> Hi,
>>> I am running Solr 6.1.0 with SolrCloud. We have 3 instances of
>>> ZooKeeper and 3 instances of Solr. All three of them are active and
>>> up. One collection has 3 shards, and each shard has 2 replicas.
>>>
>>> Every time we query, whether from SolrJ or the admin UI, we get
>>> inconsistent results, e.g.:
>>> 1. numFound is always fluctuating.
>>> 2. The facet count shows a count for a field, but a filter query on
>>> that field gets 0 results.
>>> 3. Luke requests work per shard but not on the collection when
>>> invoked from curl (not sure whether they give correct info for all
>>> the dynamic fields), and they don't work when called from SolrJ.
>>> 4. The admin UI shows expanded results; for the same query issued
>>> through SolrJ, getExpandedResults() gives 0 docs.
>>>
>>> What would be the cause of all this? Any pointers on what to look
>>> for, an error or anything in the logs?
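A sketch of Jan's steps 1-3 as shell commands, assuming the data
directory layout quoted in this thread (/var/solr/data) and hypothetical
ports and ZooKeeper hosts; run this only on the node hosting the stale
replica, while the known-good replica stays up as leader:

    bin/solr stop -p 8983
    rm -rf /var/solr/data/product_shard1_replica1/data
    bin/solr start -c -p 8983 -z zk1:2181,zk2:2181,zk3:2181

On startup the core finds an empty data directory and performs a full
recovery from the shard leader, which covers step 4.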
Re: Inconsistent results with solr admin ui and solrj
Hi,
a.) Yes, the index is static, not updated live. We index new documents
over old documents by this sequence: delete all docs, add 10 freshly
fetched from the db, and after adding all the docs to the cloud instance,
commit. Commit happens only once per collection.

b.) I took one shard, and below are the results for each replica; it has
2 replicas.

Replica 2
Last Modified: 33 minutes ago
Num Docs: 127970
Max Doc: 127970
Heap Memory Usage: -1
Deleted Docs: 0
Version: 14530
Segment Count: 5
Optimized: yes
Current: yes
Data: /var/solr/data/product_shard1_replica2/data
Index: /var/solr/data/product_shard1_replica2/data/index.20160816040537452
Impl: org.apache.solr.core.NRTCachingDirectoryFactory

Replica 1
Last Modified: about 19 hours ago
Num Docs: 234013
Max Doc: 234013
Heap Memory Usage: -1
Deleted Docs: 0
Version: 14272
Segment Count: 7
Optimized: yes
Current: no
Data: /var/solr/data/product_shard1_replica1/data
Index: /var/solr/data/product_shard1_replica1/data/index
Impl: org.apache.solr.core.NRTCachingDirectoryFactory

c.) With the admin UI, if I query for all (*:*), it gives a different
numFound each time, e.g.:

1.
{"responseHeader":{"zkConnected":true,"status":0,"QTime":7,"params":{
"q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
"response":{"numFound":452300,"start":0,"maxScore":1.0, ...

2.
{"responseHeader":{"zkConnected":true,"status":0,"QTime":23,"params":{
"q":"*:*","indent":"on","wt":"json","_":"1471322871767"}},
"response":{"numFound":574013,"start":0,"maxScore":1.0, ...

This is queried live from the Solr instances.

It happens with any type of query, whether I search in the parent
document or search through child documents to get parents. Sorting is
used in both cases but with a different field: while doing a block join
query, sorting is on the child document field; otherwise it is on the
parent document field.

d.) I don't find any errors in the logs. All warnings only.

On 14/08/16 02:56, Jan Høydahl wrote:

Could it be that your cluster is not in sync, so that when Solr picks
three nodes, results will vary depending on what replica answers?

A few questions:

a) Is your index static, i.e. not being updated live?
b) Can you try to go directly to the core menu of both replicas for each
shard, and compare numDocs / maxDocs for each? Both replicas in each
shard should have the same count.
c) What are you querying on and sorting by? Does it happen with only one
query and sorting?
d) Are there any errors in the logs?

If possible, please share some queries, responses, config, screenshots
etc.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 13 Aug 2016 at 12:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:

Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper
and 3 instances of Solr. All three of them are active and up. One
collection has 3 shards, and each shard has 2 replicas.

Every time we query, whether from SolrJ or the admin UI, we get
inconsistent results, e.g.:
1. numFound is always fluctuating.
2. The facet count shows a count for a field, but a filter query on that
field gets 0 results.
3. Luke requests work per shard but not on the collection when invoked
from curl (not sure whether they give correct info for all the dynamic
fields), and they don't work when called from SolrJ.
4. The admin UI shows expanded results; for the same query issued
through SolrJ, getExpandedResults() gives 0 docs.

What would be the cause of all this? Any pointers on what to look for,
an error or anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
Could it be that your cluster is not in sync, so that when Solr picks
three nodes, results will vary depending on what replica answers?

A few questions:

a) Is your index static, i.e. not being updated live?
b) Can you try to go directly to the core menu of both replicas for each
shard, and compare numDocs / maxDocs for each? Both replicas in each
shard should have the same count.
c) What are you querying on and sorting by? Does it happen with only one
query and sorting?
d) Are there any errors in the logs?

If possible, please share some queries, responses, config, screenshots
etc.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 13 Aug 2016 at 12:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:
>
> Hi,
> I am running Solr 6.1.0 with SolrCloud. We have 3 instances of
> ZooKeeper and 3 instances of Solr. All three of them are active and up.
> One collection has 3 shards, and each shard has 2 replicas.
>
> Every time we query, whether from SolrJ or the admin UI, we get
> inconsistent results, e.g.:
> 1. numFound is always fluctuating.
> 2. The facet count shows a count for a field, but a filter query on
> that field gets 0 results.
> 3. Luke requests work per shard but not on the collection when invoked
> from curl (not sure whether they give correct info for all the dynamic
> fields), and they don't work when called from SolrJ.
> 4. The admin UI shows expanded results; for the same query issued
> through SolrJ, getExpandedResults() gives 0 docs.
>
> What would be the cause of all this? Any pointers on what to look for,
> an error or anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
Wireshark should show you what the HTTP request actually looks like. So,
it's a definitive reference. I still recommend double-checking that
equivalence first; it is just a sanity check before doing any more
expensive digging.

You can also enable trace logging in the admin UI to see low-level
request details to compare, but I don't remember which particular element
right now.

Regards,
Alex

On 13 Aug 2016 10:14 PM, "Pranaya Behera" <pranaya.beh...@igp.com> wrote:

Hi Alexandre,
I am sure I am firing the same queries with the same collection every
time. How will Wireshark help? I am sorry, I am not experienced with that
tool.

On 13/08/16 17:37, Alexandre Rafalovitch wrote:
> Are you sure you are issuing the same queries to the same collections
> and the same request handlers?
>
> I would verify that before all else. Using network sniffers (Wireshark)
> if necessary.
>
> Regards,
> Alex
>
> On 13 Aug 2016 8:11 PM, "Pranaya Behera" <pranaya.beh...@igp.com> wrote:
>
> Hi,
> I am running Solr 6.1.0 with SolrCloud. We have 3 instances of
> ZooKeeper and 3 instances of Solr. All three of them are active and up.
> One collection has 3 shards, and each shard has 2 replicas.
>
> Every time we query, whether from SolrJ or the admin UI, we get
> inconsistent results, e.g.:
> 1. numFound is always fluctuating.
> 2. The facet count shows a count for a field, but a filter query on
> that field gets 0 results.
> 3. Luke requests work per shard but not on the collection when invoked
> from curl (not sure whether they give correct info for all the dynamic
> fields), and they don't work when called from SolrJ.
> 4. The admin UI shows expanded results; for the same query issued
> through SolrJ, getExpandedResults() gives 0 docs.
>
> What would be the cause of all this? Any pointers on what to look for,
> an error or anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
Hi Alexandre,
I am sure I am firing the same queries with the same collection every
time. How will Wireshark help? I am sorry, I am not experienced with that
tool.

On 13/08/16 17:37, Alexandre Rafalovitch wrote:

Are you sure you are issuing the same queries to the same collections and
the same request handlers?

I would verify that before all else. Using network sniffers (Wireshark)
if necessary.

Regards,
Alex

On 13 Aug 2016 8:11 PM, "Pranaya Behera" <pranaya.beh...@igp.com> wrote:

Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper
and 3 instances of Solr. All three of them are active and up. One
collection has 3 shards, and each shard has 2 replicas.

Every time we query, whether from SolrJ or the admin UI, we get
inconsistent results, e.g.:
1. numFound is always fluctuating.
2. The facet count shows a count for a field, but a filter query on that
field gets 0 results.
3. Luke requests work per shard but not on the collection when invoked
from curl (not sure whether they give correct info for all the dynamic
fields), and they don't work when called from SolrJ.
4. The admin UI shows expanded results; for the same query issued
through SolrJ, getExpandedResults() gives 0 docs.

What would be the cause of all this? Any pointers on what to look for,
an error or anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
Are you sure you are issuing the same queries to the same collections and the same request handlers? I would verify that before all else, using network sniffers (Wireshark) if necessary.

Regards,
Alex

On 13 Aug 2016 8:11 PM, "Pranaya Behera" <pranaya.beh...@igp.com> wrote:

Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper and 3 instances of Solr. All three of them are active and up. One collection has 3 shards; each shard has 2 replicas.

Every time I query, whether from SolrJ or the admin UI, I get inconsistent results, e.g.:
1. numFound is always fluctuating.
2. A facet shows a count for a field, but a filter query on that field gets 0 results.
3. Luke requests work (not sure whether they give correct info for all the dynamic fields) per shard, not on the collection, when invoked from curl, but don't work when called from SolrJ.
4. The admin UI shows expanded results; the same query sent from SolrJ gives 0 docs from getExpandedResults().

What would be the cause of all this? Any pointers on where to look for an error, anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
Hi,
I am using the Java client, i.e. SolrJ.

On 13/08/16 16:31, GW wrote:

No offense intended, but you are looking at a problem with your own work. You need to explain what you are doing, not just what is happening. If you are trying to use PHP and the latest PECL/PEAR, it does not work so well; it is considerably older than Solr 6.1. This was the only issue I ran into with 6.1.

On 13 August 2016 at 06:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:

Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper and 3 instances of Solr. All three of them are active and up. One collection has 3 shards; each shard has 2 replicas.

Every time I query, whether from SolrJ or the admin UI, I get inconsistent results, e.g.:
1. numFound is always fluctuating.
2. A facet shows a count for a field, but a filter query on that field gets 0 results.
3. Luke requests work (not sure whether they give correct info for all the dynamic fields) per shard, not on the collection, when invoked from curl, but don't work when called from SolrJ.
4. The admin UI shows expanded results; the same query sent from SolrJ gives 0 docs from getExpandedResults().

What would be the cause of all this? Any pointers on where to look for an error, anything in the logs?
Re: Inconsistent results with solr admin ui and solrj
No offense intended, but you are looking at a problem with your own work. You need to explain what you are doing, not just what is happening. If you are trying to use PHP and the latest PECL/PEAR, it does not work so well; it is considerably older than Solr 6.1. This was the only issue I ran into with 6.1.

On 13 August 2016 at 06:10, Pranaya Behera <pranaya.beh...@igp.com> wrote:

> Hi,
> I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper
> and 3 instances of Solr. All three of them are active and up. One
> collection has 3 shards; each shard has 2 replicas.
>
> Every time I query, whether from SolrJ or the admin UI, I get inconsistent
> results, e.g.:
> 1. numFound is always fluctuating.
> 2. A facet shows a count for a field, but a filter query on that field
> gets 0 results.
> 3. Luke requests work (not sure whether they give correct info for all the
> dynamic fields) per shard, not on the collection, when invoked from curl,
> but don't work when called from SolrJ.
> 4. The admin UI shows expanded results; the same query sent from SolrJ
> gives 0 docs from getExpandedResults().
>
> What would be the cause of all this? Any pointers on where to look for an
> error, anything in the logs?
Inconsistent results with solr admin ui and solrj
Hi,
I am running Solr 6.1.0 with SolrCloud. We have 3 instances of ZooKeeper and 3 instances of Solr. All three of them are active and up. One collection has 3 shards; each shard has 2 replicas.

Every time I query, whether from SolrJ or the admin UI, I get inconsistent results, e.g.:
1. numFound is always fluctuating.
2. A facet shows a count for a field, but a filter query on that field gets 0 results.
3. Luke requests work (not sure whether they give correct info for all the dynamic fields) per shard, not on the collection, when invoked from curl, but don't work when called from SolrJ.
4. The admin UI shows expanded results; the same query sent from SolrJ gives 0 docs from getExpandedResults().

What would be the cause of all this? Any pointers on where to look for an error, anything in the logs?
Re: Solr cloud with Grouping query gives inconsistent results
Hi Mary, Yes the field used for grouping is stored=true. Thanks Preeti On Wed, May 25, 2016 at 7:04 PM, Mary Whitewrote: > Hi Preeti, > > Do you have stored=true on the field you are trying to query? > > Sent from my iPhone > > > On May 25, 2016, at 8:30 AM, preeti kumari > wrote: > > > > Thanks Jeff. Let me try this .I was actually looking for a way without > doc > > routing. > > Do let me know if I can handle grouping through queries. > > > > > > Thanks > > Preeti > > > >> On Tue, May 24, 2016 at 2:08 AM, Jeff Wartes > wrote: > >> > >> My first thought is that you haven’t indexed such that all values of the > >> field you’re grouping on are found in the same cores. > >> > >> See the end of the article here: (Distributed Result Grouping Caveats) > >> https://cwiki.apache.org/confluence/display/solr/Result+Grouping > >> > >> And the “Document Routing” section here: > >> > >> > https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud > >> > >> If I’m right, you haven’t used the “amid” field as part of your doc > >> routing policy. > >> > >> > >> > >>> On 5/23/16, 3:57 AM, "preeti kumari" wrote: > >>> > >>> Hi All, > >>> > >>> I am using grouping query with solr cloud version 5.2.1 . > >>> Parameters added in my query is > >>> =SIM*group=true=amid=1=true. But > each > >>> time I hit the query i get different results i.e top 10 results are > >>> different each time. > >>> > >>> Why is it so ? Please help me with this. > >>> Is there any way by which I can get consistent results from grouping > query > >>> in solr cloud. > >>> > >>> Thanks > >>> Preeti > >> > >> > >
Re: Solr cloud with Grouping query gives inconsistent results
Hi Preeti, Do you have stored=true on the field you are trying to query? Sent from my iPhone > On May 25, 2016, at 8:30 AM, preeti kumariwrote: > > Thanks Jeff. Let me try this .I was actually looking for a way without doc > routing. > Do let me know if I can handle grouping through queries. > > > Thanks > Preeti > >> On Tue, May 24, 2016 at 2:08 AM, Jeff Wartes wrote: >> >> My first thought is that you haven’t indexed such that all values of the >> field you’re grouping on are found in the same cores. >> >> See the end of the article here: (Distributed Result Grouping Caveats) >> https://cwiki.apache.org/confluence/display/solr/Result+Grouping >> >> And the “Document Routing” section here: >> >> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud >> >> If I’m right, you haven’t used the “amid” field as part of your doc >> routing policy. >> >> >> >>> On 5/23/16, 3:57 AM, "preeti kumari" wrote: >>> >>> Hi All, >>> >>> I am using grouping query with solr cloud version 5.2.1 . >>> Parameters added in my query is >>> =SIM*group=true=amid=1=true. But each >>> time I hit the query i get different results i.e top 10 results are >>> different each time. >>> >>> Why is it so ? Please help me with this. >>> Is there any way by which I can get consistent results from grouping query >>> in solr cloud. >>> >>> Thanks >>> Preeti >> >>
Re: Solr cloud with Grouping query gives inconsistent results
Thanks Jeff. Let me try this. I was actually looking for a way without doc routing. Do let me know if I can handle grouping through queries.

Thanks
Preeti

On Tue, May 24, 2016 at 2:08 AM, Jeff Wartes wrote:

> My first thought is that you haven’t indexed such that all values of the
> field you’re grouping on are found in the same cores.
>
> See the end of the article here: (Distributed Result Grouping Caveats)
> https://cwiki.apache.org/confluence/display/solr/Result+Grouping
>
> And the “Document Routing” section here:
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
>
> If I’m right, you haven’t used the “amid” field as part of your doc
> routing policy.
>
> On 5/23/16, 3:57 AM, "preeti kumari" wrote:
>
> Hi All,
>
> I am using a grouping query with Solr Cloud version 5.2.1. Parameters
> added in my query are =SIM*group=true=amid=1=true. But each time I hit
> the query I get different results, i.e. the top 10 results are different
> each time.
>
> Why is it so? Please help me with this. Is there any way I can get
> consistent results from a grouping query in Solr Cloud?
>
> Thanks
> Preeti
Re: Solr cloud with Grouping query gives inconsistent results
My first thought is that you haven’t indexed such that all values of the field you’re grouping on are found in the same cores. See the end of the article here: (Distributed Result Grouping Caveats) https://cwiki.apache.org/confluence/display/solr/Result+Grouping And the “Document Routing” section here: https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud If I’m right, you haven’t used the “amid” field as part of your doc routing policy. On 5/23/16, 3:57 AM, "preeti kumari"wrote: >Hi All, > >I am using grouping query with solr cloud version 5.2.1 . >Parameters added in my query is >=SIM*group=true=amid=1=true. But each >time I hit the query i get different results i.e top 10 results are >different each time. > >Why is it so ? Please help me with this. >Is there any way by which I can get consistent results from grouping query >in solr cloud. > >Thanks >Preeti
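To make the co-location concrete: with the default compositeId router, documents whose ids share the same prefix before the "!" separator land on the same shard, so routing by the grouping field keeps each group together. A minimal SolrJ sketch under that assumption, using the 5.x client API; the collection name, zkHost string and field values are hypothetical:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class RoutedIndexing {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
            client.setDefaultCollection("collection1");
            SolrInputDocument doc = new SolrInputDocument();
            String amid = "AM123"; // the grouping key
            // compositeId routing: everything before '!' picks the shard,
            // so all docs with the same amid end up on the same shard.
            doc.addField("id", amid + "!" + "doc-42");
            doc.addField("amid", amid);
            client.add(doc);
            client.commit();
        }
    }
}

The trade-off is that shard sizes then follow the distribution of amid values, and existing documents have to be re-indexed with the new ids.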
Solr cloud with Grouping query gives inconsistent results
Hi All,

I am using a grouping query with Solr Cloud version 5.2.1. Parameters added in my query are =SIM*group=true=amid=1=true. But each time I hit the query I get different results, i.e. the top 10 results are different each time.

Why is it so? Please help me with this. Is there any way I can get consistent results from a grouping query in Solr Cloud?

Thanks
Preeti
Newly added json facet api returning inconsistent results in distributed mode
Hi,

I am new to the Solr community and I am sorry if this is not the right medium to bring the issue to notice. I have found the following issue, as mentioned in the subject, and raised a ticket for it: https://issues.apache.org/jira/browse/SOLR-7452. Any help is appreciated!

Sent from my iPhone
Inconsistent results in a distributed configuration
I'm getting inconsistent results in a distributed configuration. Using the stats command over a single core containing about 3 million docs, I got 452660794509326.7 (a double-type field). On the other hand, when partitioning the data into 2 or 4 cores I get a different result: 452660794509325.4. Has anyone faced the same problem? Is it a misconfiguration or a bug? Any hints?

--
View this message in context: http://lucene.472066.n3.nabble.com/Inconsistent-results-in-a-distributed-configuration-tp4116061.html
Sent from the Solr - User mailing list archive at Nabble.com.
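For what it's worth, a difference only in the low-order digits is the signature of floating-point summation order rather than a misconfiguration: each core aggregates its own documents and the partial results are then merged, and double addition is not associative. A self-contained Java sketch of the effect, using synthetic values rather than the poster's data:

public class FpOrder {
    public static void main(String[] args) {
        // Synthetic values; large magnitudes make the rounding visible.
        double[] vals = new double[3_000_000];
        for (int i = 0; i < vals.length; i++) {
            vals[i] = 1e8 + i * 1e-3;
        }
        // One core: a single left-to-right sum.
        double single = 0;
        for (double v : vals) single += v;
        // Two "shards": partial sums merged afterwards, a different order.
        double s1 = 0, s2 = 0;
        for (int i = 0; i < vals.length; i++) {
            if (i % 2 == 0) s1 += vals[i]; else s2 += vals[i];
        }
        System.out.println("one core   : " + single);
        System.out.println("two shards : " + (s1 + s2)); // typically differs in the last digits
    }
}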
SOLR inconsistent results?
I have two Solr instances. One is a master and the other a slave, polling the master every 20 seconds or so for index updates. My application mainly queries the slave, so most of the load falls to it. Some areas of the application do query the master, however. For instance, during the execution of an action (I am using the Symfony 2 framework + Solarium bundle + Solarium lib) I query the master, not just once but between 20 and 50 times during the lifetime of the action. You can assume that this amount of querying is tolerable.

What occurs during the querying has left me perplexed. If I execute the action (make a page request through the browser), say, twice, the set of results returned is different for each request. To simplify, if the action only queried the master three times, then:

page request one: (first query: 1 hit, second query: 0 hits, third query: 1 hit)
page request two: (first query: 0 hits, second query: 1 hit, third query: 0 hits)

There are no differences between the queries in the first page request and the second (although the three queries themselves differ from each other); they are the exact same queries. I tail the request logs on the Solr master instance, and it logs all of the requests, so all requests made by the application code are being received correctly by the master (this rules out connection issues and application-level issues), but it seems to get hits sometimes and not other times. When I perform the same query (that returned 0 hits during the execution of the action) in the front-end Solr interface, I do get the hit I am expecting.

There is another server, apart from the master, slave, and application, that runs a process continuously updating the index based on changes detected in the source data - a relational database.

Could anyone provide some insight into this inconsistent behavior? Why would Solr produce two different results for the same query?

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-inconsistent-results-tp4035888.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
This might explain another thing I'm seeing. If I take a node down, clusterstate.json still shows it as active. Also if I'm running 4 nodes, take one down and assign it a new port, clusterstate.json will show 5 nodes running. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.com wrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red-herring FWIW Erick On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re read through everything once more... - Mark -- This e-mail and any files transmitted with it may be proprietary. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of Apogee Integration.
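One way to see that discrepancy directly is to compare the (possibly stale) state recorded in clusterstate.json with the ephemeral /live_nodes entries, which disappear as soon as a node's ZooKeeper session dies. A sketch using the ZooKeeper CLI, with the Solr 4.x default paths; the host and port are assumptions:

zkCli.cmd -server localhost:2181
ls /live_nodes          (ephemeral entries: only truly live nodes appear here)
get /clusterstate.json  (a state of "active" here can lag after a node dies)

A node can therefore still be listed as active in clusterstate.json while being absent from /live_nodes; clients are expected to intersect the two to decide which nodes are really up.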
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
From every node in your cluster you can hit http://MACHINE1:8084/solr in your browser and get a response? On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output: org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect Retrying request shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java: 483) .. .. .. Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.connect0(Native Method) .. .. .. try and ask http://MACHINE1:8084/solr to recover Could not tell a replica to recover org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) ... ... ... Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.waitForConnect(Native method) .. .. .. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. 
Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red-herring FWIW Erick On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re read through everything once more... - Mark -- This e-mail and any files transmitted with it may be proprietary. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of Apogee Integration. - Mark Miller lucidimagination.com
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
The cluster is running on one machine. On Sun, Mar 18, 2012 at 2:07 PM, Mark Miller markrmil...@gmail.com wrote: From every node in your cluster you can hit http://MACHINE1:8084/solr in your browser and get a response? On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output: org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect Retrying request shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java: 483) .. .. .. Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.connect0(Native Method) .. .. .. try and ask http://MACHINE1:8084/solr to recover Could not tell a replica to recover org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) ... ... ... Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.waitForConnect(Native method) .. .. .. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. 
Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red-herring FWIW Erick On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re read through everything once more... - Mark -- This e-mail and any files transmitted with it may be proprietary. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of Apogee Integration. - Mark Miller lucidimagination.com -- This e-mail and any files transmitted with it may
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I think he's asking if all the nodes (same machine or not) return a response. Presumably you have different ports for each node since they are on the same machine. On Sun, 2012-03-18 at 14:44 -0400, Matthew Parker wrote: The cluster is running on one machine. On Sun, Mar 18, 2012 at 2:07 PM, Mark Miller markrmil...@gmail.com wrote: From every node in your cluster you can hit http://MACHINE1:8084/solr in your browser and get a response? On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output: org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect Retrying request shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java: 483) .. .. .. Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.connect0(Native Method) .. .. .. try and ask http://MACHINE1:8084/solr to recover Could not tell a replica to recover org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) ... ... ... Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.waitForConnect(Native method) .. .. .. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. 
Thanks for your help. Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red-herring FWIW Erick On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re read through everything once more... - Mark
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I have nodes running on ports 8081-8084. A couple of the other SOLR cloud nodes were complaining about not being able to talk with 8081, which is the first node brought up in the cluster.

The startup process is:
1. start 3 zookeeper nodes
2. wait until complete
3. start first solr node
4. wait until complete
5. start remaining 3 solr nodes

I wiped the zookeeper and solr nodes' data directories to start fresh.

Another question: would a Tika exception cause the nodes not to replicate? I can see the documents being committed on the first solr node, but nothing replicates to the other 3.

On Sun, Mar 18, 2012 at 2:07 PM, Mark Miller markrmil...@gmail.com wrote: From every node in your cluster you can hit http://MACHINE1:8084/solr in your browser and get a response?

On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output:

org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect
org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect
org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect
Retrying request
shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)
  ...
Caused by: java.net.ConnectException: Connection refused: connect
  at java.net.DualStackPlainSocketImpl.connect0(Native Method)
  ...
try and ask http://MACHINE1:8084/solr to recover
Could not tell a replica to recover
org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)
  ...
Caused by: java.net.ConnectException: Connection refused: connect
  at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
  ...

On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to set up a cluster between my macbook and linux box. Each node talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com

On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manage the process?
On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red-herring FWIW Erick On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I assuming the windows configuration looked correct? Yeah, so far I can
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I had tried importing data from Manifold, and one document threw a Tika Exception. If I shut everything down and restart SOLR cloud, the system sync'd on startup. Could extraction errors be the issue? On Sun, Mar 18, 2012 at 2:50 PM, Matthew Parker mpar...@apogeeintegration.com wrote: I have nodes running on ports: 8081-8084 A couple of the other SOLR cloud nodes we complaining about not being talk with 8081, which is the first node brought up in the cluster. The startup process is: 1. start 3 zookeeper nodes 2. wait until complete 3. start first solr node. 4. wait until complete 5. start remaining 3 solr nodes. I wiped the zookeper and solr nodes data directories to start fresh. Another question: Would a Tika Exception cause the nodes not to replicate? I can see the documents being commited on the first solr node, but nothing replicates to the other 3. On Sun, Mar 18, 2012 at 2:07 PM, Mark Miller markrmil...@gmail.comwrote: From every node in your cluster you can hit http://MACHINE1:8084/solr in your browser and get a response? On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output: org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect Retrying request shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java: 483) .. .. .. Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.connect0(Native Method) .. .. .. try and ask http://MACHINE1:8084/solr to recover Could not tell a replica to recover org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) ... ... ... Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.waitForConnect(Native method) .. .. .. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. 
- Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory on the theory that *maybe* there was some confusion happening on startup with an already-wonky index.
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
That idea was short lived. I excluded the document. The cluster isn't syncing even after shutting everything down and restarting. On Sun, Mar 18, 2012 at 2:58 PM, Matthew Parker mpar...@apogeeintegration.com wrote: I had tried importing data from Manifold, and one document threw a Tika Exception. If I shut everything down and restart SOLR cloud, the system sync'd on startup. Could extraction errors be the issue? On Sun, Mar 18, 2012 at 2:50 PM, Matthew Parker mpar...@apogeeintegration.com wrote: I have nodes running on ports: 8081-8084 A couple of the other SOLR cloud nodes we complaining about not being talk with 8081, which is the first node brought up in the cluster. The startup process is: 1. start 3 zookeeper nodes 2. wait until complete 3. start first solr node. 4. wait until complete 5. start remaining 3 solr nodes. I wiped the zookeper and solr nodes data directories to start fresh. Another question: Would a Tika Exception cause the nodes not to replicate? I can see the documents being commited on the first solr node, but nothing replicates to the other 3. On Sun, Mar 18, 2012 at 2:07 PM, Mark Miller markrmil...@gmail.comwrote: From every node in your cluster you can hit http://MACHINE1:8084/solrin your browser and get a response? On Mar 18, 2012, at 1:46 PM, Matthew Parker wrote: My cloud instance finally tried to sync. It looks like it's having connection issues, but I can bring the SOLR instance up in the browser so I'm not sure why it cannot connect to it. I got the following condensed log output: org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect org.apache.commons.httpclient.HttpMethodDirector executeWithRetry I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect Retrying request shard update error StdNode: http://MACHINE1:8084/solr/:org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java: 483) .. .. .. Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.connect0(Native Method) .. .. .. try and ask http://MACHINE1:8084/solr to recover Could not tell a replica to recover org.apache.solr.client.solrj.SolrServerException: http://MACHINE1:8084/solr at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) ... ... ... Caused by: java.net.ConnectException: Connection refused: connect at java.net.DualStackPlainSocketImpl.waitForConnect(Native method) .. .. .. On Sat, Mar 17, 2012 at 10:10 PM, Mark Miller markrmil...@gmail.com wrote: Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to setup a cluster between my macbook and linux box. 
Each nodes talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses. - Mark Miller lucidimagination.com On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manager the process? On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30) , sent some document through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.comwrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Nodes talk to ZooKeeper as well as to each other. You can see the addresses they are trying to use to communicate with each other in the 'cloud' view of the Solr Admin UI. Sometimes you have to override these, as the detected default may not be an address that other nodes can reach. As a limited example: for some reason my mac cannot talk to my linux box with its default detected host address of halfmetal:8983/solr - but the mac can reach my linux box if I use halfmetal.Local - so I have to override the published address of my linux box using the host attribute if I want to set up a cluster between my macbook and linux box.

Each node talks to ZooKeeper to learn about the other nodes, including their addresses. Recovery is then done node to node using the appropriate addresses.

- Mark Miller
lucidimagination.com

On Mar 16, 2012, at 3:00 PM, Matthew Parker wrote: I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manage the process?

On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30), sent some documents through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt

On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.com wrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory, on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red herring. FWIW, Erick

On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I'm assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re-read through everything once more... - Mark
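For reference, the override Mark describes lives in solr.xml in the 4.x layout. A sketch assuming the stock example file; the hostname is Mark's example value and everything else should match your own install:

<cores adminPath="/admin/cores" defaultCoreName="collection1"
       host="halfmetal.Local" hostPort="${jetty.port:}">
  <core name="collection1" instanceDir="collection1" />
</cores>

The host attribute (or a -Dhost=... system property, if solr.xml keeps the default host="${host:}" placeholder) becomes the address the node publishes to ZooKeeper, so it must be one the other nodes can actually reach.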
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I'm still having issues replicating in my work environment. Can anyone explain how the replication mechanism works? Is it communicating across ports or through zookeeper to manage the process?

On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker mpar...@apogeeintegration.com wrote: All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30), sent some documents through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow. Thanks for your help. Regards, Matt

On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.com wrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory, on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red herring. FWIW, Erick

On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I'm assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re-read through everything once more... - Mark
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
All,

I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30), sent some documents through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow.

Thanks for your help.

Regards,
Matt

On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.com wrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory, on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red herring. FWIW, Erick

On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I'm assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re-read through everything once more... - Mark
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Matt:

Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory, on the theory that *maybe* there was some confusion happening on startup with an already-wonky index.

If you have the energy and tried that, it might be helpful information, but it may also be a total red herring. FWIW,
Erick

On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I'm assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re-read through everything once more... - Mark
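A sketch of that clean-slate reset on Windows (the platform used in this thread), assuming a single core named collection1 under the stock example layout; the paths are illustrative, not taken from the original mails:

rem stop all Solr and ZooKeeper processes first
rmdir /s /q example\solr\collection1\data
rmdir /s /q zoo_data
mkdir example\solr\collection1\data

After recreating the empty data directory, restart ZooKeeper first and then the Solr nodes, as in the startup sequence described earlier in the thread.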
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I've ensured the SOLR data subdirectories and files were completely cleaned out, but the issue still occurs.

On Fri, Mar 2, 2012 at 9:06 AM, Erick Erickson erickerick...@gmail.com wrote: Matt: Just for paranoia's sake, when I was playing around with this (the _version_ thing was one of my problems too) I removed the entire data directory as well as the zoo_data directory between experiments (and recreated just the data dir). This included various index.2012 files and the tlog directory, on the theory that *maybe* there was some confusion happening on startup with an already-wonky index. If you have the energy and tried that it might be helpful information, but it may also be a total red herring. FWIW, Erick

On Thu, Mar 1, 2012 at 8:28 PM, Mark Miller markrmil...@gmail.com wrote: I'm assuming the windows configuration looked correct? Yeah, so far I can not spot any smoking gun...I'm confounded at the moment. I'll re-read through everything once more... - Mark
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I ran the following commands to start the other 3 instances from their home directories:

java -Djetty.port=8082 -Dhostport=8082 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8083 -Dhostport=8083 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8084 -Dhostport=8084 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar

All start up without issue.

Step 6 - Modified solrconfig.xml to have a custom request handler
===

<requestHandler name="/update/sharepoint" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="update.chain">sharepoint-pipeline</str>
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored</str>
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="sharepoint-pipeline">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">url</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

Hopefully this will shed some light on why my configuration is having issues.

Thanks for your help.

Matt

On Tue, Feb 28, 2012 at 8:29 PM, Mark Miller markrmil...@gmail.com wrote: Hmm...this is very strange - there is nothing interesting in any of the logs? In clusterstate.json, all of the shards have an active state? There are quite a few of us doing exactly this setup recently, so there must be something we are missing here... Any info you can offer might help. - Mark

On Feb 28, 2012, at 1:00 PM, Matthew Parker wrote: Mark, I got the codebase from 2/26/2012, and I got the same inconsistent results. I have solr running on four ports, 8081-8084. 8081 and 8082 are the leaders for shard 1 and shard 2, respectively; 8083 is assigned to shard 1; 8084 is assigned to shard 2. Queries come in, and sometimes it seems the windows for 8081 and 8083 move, responding to the query, but there are no results. If the queries run on 8081/8082 or 8081/8084, then results come back OK. The query is nothing more than: q=*:* Regards, Matt

On Mon, Feb 27, 2012 at 9:26 PM, Matthew Parker mpar...@apogeeintegration.com wrote: I'll have to check on the commit situation. We have been pushing data from SharePoint the last week or so. Would that somehow block the documents moving between the solr instances? I'll try another version tomorrow. Thanks for the suggestions.

On Mon, Feb 27, 2012 at 5:34 PM, Mark Miller markrmil...@gmail.com wrote: Hmmm...all of that looks pretty normal... Did a commit somehow fail on the other machine? When you view the stats for the update handler, are there a lot of pending adds for one of the nodes? Do the commit counts match across nodes? You can also query an individual node with distrib=false to check that. If your build is a month old, I'd honestly recommend you try upgrading as well. - Mark

On Feb 27, 2012, at 3:34 PM, Matthew Parker wrote: Here is most of the cluster state:

Connected to Zookeeper localhost:2181, localhost:2182, localhost:2183

/ (v=0 children=7)
  /CONFIGS (v=0 children=1)
    /CONFIGURATION (v=0 children=25)
      all the configuration files, velocity info, xslt, etc.
  /NODE_STATES (v=0 children=4)
    MACHINE1:8083_SOLR (v=121) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8082_SOLR (v=101) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8081_SOLR (v=92) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8084_SOLR (v=73) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
  /ZOOKEEPER (v=0 children=1)
    QUOTA (v=0)
  /CLUSTERSTATE.JSON (v=272) {collection1:{shard1:{MACHINE1:8081_solr_:{shard_id:shard1,leader:true,...
  /LIVE_NODES (v=0 children=4)
    MACHINE1:8083_SOLR (ephemeral v=0)
    MACHINE1:8082_SOLR (ephemeral v=0)
    MACHINE1:8081_SOLR (ephemeral v=0)
    MACHINE1:8084_SOLR (ephemeral v=0)
  /COLLECTIONS (v=1 children=1)
    COLLECTION1 (v=0 children=2) {configName:configuration1}
      LEADER_ELECT (v=0
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I ran the following commands to start the other 3 instances from their home directories:

java -Djetty.port=8082 -Dhostport=8082 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8083 -Dhostport=8083 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8084 -Dhostport=8084 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar

All start up without issue.

Step 6 - Modified solrconfig.xml to have a custom request handler
===

<requestHandler name="/update/sharepoint" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="update.chain">sharepoint-pipeline</str>
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored</str>
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="sharepoint-pipeline">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">url</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

Hopefully this will shed some light on why my configuration is having issues.

Thanks for your help.

Matt

On Tue, Feb 28, 2012 at 8:29 PM, Mark Miller markrmil...@gmail.com wrote: Hmm...this is very strange - there is nothing interesting in any of the logs? In clusterstate.json, all of the shards have an active state? There are quite a few of us doing exactly this setup recently, so there must be something we are missing here... Any info you can offer might help. - Mark

On Feb 28, 2012, at 1:00 PM, Matthew Parker wrote: Mark, I got the codebase from 2/26/2012, and I got the same inconsistent results. I have solr running on four ports, 8081-8084. 8081 and 8082 are the leaders for shard 1 and shard 2, respectively; 8083 is assigned to shard 1; 8084 is assigned to shard 2. Queries come in, and sometimes it seems the windows for 8081 and 8083 move, responding to the query, but there are no results. If the queries run on 8081/8082 or 8081/8084, then results come back OK. The query is nothing more than: q=*:* Regards, Matt

On Mon, Feb 27, 2012 at 9:26 PM, Matthew Parker mpar...@apogeeintegration.com wrote: I'll have to check on the commit situation. We have been pushing data from SharePoint the last week or so. Would that somehow block the documents moving between the solr instances? I'll try another version tomorrow. Thanks for the suggestions.

On Mon, Feb 27, 2012 at 5:34 PM, Mark Miller markrmil...@gmail.com wrote: Hmmm...all of that looks pretty normal... Did a commit somehow fail on the other machine? When you view the stats for the update handler, are there a lot of pending adds for one of the nodes? Do the commit counts match across nodes? You can also query an individual node with distrib=false to check that. If your build is a month old, I'd honestly recommend you try upgrading as well. - Mark

On Feb 27, 2012, at 3:34 PM, Matthew Parker wrote: Here is most of the cluster state:

Connected to Zookeeper localhost:2181, localhost:2182, localhost:2183

/ (v=0 children=7)
  /CONFIGS (v=0 children=1)
    /CONFIGURATION (v=0 children=25)
      all the configuration files, velocity info, xslt, etc.
  /NODE_STATES (v=0 children=4)
    MACHINE1:8083_SOLR (v=121) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8082_SOLR (v=101) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8081_SOLR (v=92) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8084_SOLR (v=73) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
  /ZOOKEEPER (v=0 children=1)
    QUOTA (v=0)
  /CLUSTERSTATE.JSON (v=272) {collection1:{shard1:{MACHINE1:8081_solr_:{shard_id:shard1,leader:true,...
  /LIVE_NODES (v=0 children=4)
    MACHINE1:8083_SOLR (ephemeral v=0)
    MACHINE1:8082_SOLR (ephemeral v=0)
    MACHINE1:8081_SOLR (ephemeral v=0)
    MACHINE1:8084_SOLR (ephemeral v=0)
  /COLLECTIONS (v=1 children=1)
    COLLECTION1 (v=0 children=2) {configName:configuration1}
      LEADER_ELECT (v=0 children=2)
        SHARD1 (v=0 children=1)
          ELECTION (v=0 children=2)
            87186203314552835-MACHINE1:8081_SOLR_-N_96 (ephemeral v=0)
            87186203314552836-MACHINE1:8083_SOLR_-N_84 (ephemeral v=0)
        SHARD2 (v=0 children=1)
          ELECTION (v=0 children=2)
            231301391392833539-MACHINE1:8084_SOLR_-N_85 (ephemeral v=0)
            159243797356740611
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
> I'm assuming the windows configuration looked correct?

Yeah, so far I cannot spot any smoking gun... I'm confounded at the moment. I'll re-read through everything once more...

- Mark
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Mark,

Nothing appears to be wrong in the logs. I wiped the indexes and imported 37 files from SharePoint using Manifold. All 37 make it in, but SOLR still has issues with the results being inconsistent.

Let me run my setup by you to see whether that is the issue.

On one machine, I have three zookeeper instances, four solr instances, and a data directory for solr and zookeeper config data.

Step 1 - Zookeeper Configuration
===
I modified each zoo.cfg configuration file to have:

Zookeeper 1 - Create /zookeeper1/conf/zoo.cfg
==
tickTime=2000
initLimit=10
syncLimit=5
dataDir=[DATA_DIRECTORY]/zk1_data
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Zookeeper 1 - Create /[DATA_DIRECTORY]/zk1_data/myid with the following contents:
==
1

Zookeeper 2 - Create /zookeeper2/conf/zoo.cfg
==
tickTime=2000
initLimit=10
syncLimit=5
dataDir=[DATA_DIRECTORY]/zk2_data
clientPort=2182
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Zookeeper 2 - Create /[DATA_DIRECTORY]/zk2_data/myid with the following contents:
==
2

Zookeeper 3 - Create /zookeeper3/conf/zoo.cfg
==
tickTime=2000
initLimit=10
syncLimit=5
dataDir=[DATA_DIRECTORY]/zk3_data
clientPort=2183
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Zookeeper 3 - Create /[DATA_DIRECTORY]/zk3_data/myid with the following contents:
==
3

Step 2 - SOLR Build
===
I pulled the latest SOLR trunk down. I built it with the following command:

ant example dist

I modified the solr.war files and added the solr cell and extraction libraries to WEB-INF/lib. I couldn't get the extraction to work any other way. Will ZooKeeper pick up jar files stored with the rest of the configuration files in ZooKeeper?

I copied the contents of the example directory to each of my SOLR directories.

Step 3 - Starting Zookeeper instances
===
I ran the following commands to start the zookeeper instances:

start .\zookeeper1\bin\zkServer.cmd
start .\zookeeper2\bin\zkServer.cmd
start .\zookeeper3\bin\zkServer.cmd

Step 4 - Start Main SOLR instance
==
I ran the following command to start the main SOLR instance:

java -Djetty.port=8081 -Dhostport=8081 -Dbootstrap_configdir=[DATA_DIRECTORY]/solr/conf -Dnumshards=2 -Dzkhost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar

Starts up fine.

Step 5 - Start the Remaining 3 SOLR Instances
==
I ran the following commands to start the other 3 instances from their home directories:

java -Djetty.port=8082 -Dhostport=8082 -Dzkhost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8083 -Dhostport=8083 -Dzkhost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
java -Djetty.port=8084 -Dhostport=8084 -Dzkhost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar

All start up without issue.
Step 6 - Modified solrconfig.xml to have a custom request handler
===

<requestHandler name="/update/sharepoint" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="update.chain">sharepoint-pipeline</str>
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored</str>
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="sharepoint-pipeline">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">url</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

Hopefully this will shed some light on why my configuration is having issues.

Thanks for your help.

Matt
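A quick way to sanity-check the ensemble from Steps 1 and 3 before starting Solr (a sketch, assuming nc is available; on Windows 7 a telnet session to the same ports works): ZooKeeper 3.3.x answers four-letter commands on its client port.

# each healthy ensemble member answers "imok"
echo ruok | nc localhost 2181
echo ruok | nc localhost 2182
echo ruok | nc localhost 2183

# "stat" reports Mode: leader or Mode: follower; exactly one of the
# three should report leader once the ensemble has quorum
echo stat | nc localhost 2181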
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
On Wed, Feb 29, 2012 at 7:03 PM, Matthew Parker mpar...@apogeeintegration.com wrote:
> I also took out my requestHandler and used the standard /update/extract handler. Same result.

How did you install/start the system this time? The same way as earlier? What kind of queries do you run?

Would it be possible for you to check out the latest version from svn? In there we have some dev scripts for linux that can be used to set up a test system easily (you need svn, jdk and ant). Essentially the steps would be:

# Checkout the sources:
svn co http://svn.apache.org/repos/asf/lucene/dev/trunk

# build and start solrcloud (1 shard, no replicas)
cd solr/cloud-dev
sh ./control.sh rebuild
sh ./control.sh reinstall 1
sh ./control.sh start 1

# index content
java -jar ../example/exampledocs/post.jar ../example/exampledocs/*.xml

# after that you can run your queries

--
 Sami Siren
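After the post.jar step, a count query is the simplest consistency probe. A sketch with one assumption: the stock Jetty example listens on port 8983, but the control.sh dev scripts may assign a different port, so adjust accordingly.

# rows=0 returns no documents, just the response header; numFound is
# the count that should stay stable across repeated runs
curl "http://localhost:8983/solr/select?q=*:*&rows=0"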
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Sami,

I have the latest as of the 26th. My system is running on a standalone network so it's not easy to get code updates without a wave of paperwork.

I installed as per the detailed instructions I laid out a couple of messages ago from today (2/29/2012).

I'm running the following query:

http://localhost:8081/solr/collection1/select?q=*:*

which gets translated to the following:

http://localhost:8081/solr/collection1/select?q=*:*&version=2.2&start=0&rows=10&indent=on

I just tried it running only two solr nodes, and I get the same results.

Regards,

Matt
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Mark,

I got the codebase from 2/26/2012, and I got the same inconsistent results.

I have solr running on four ports, 8081-8084:

8081 and 8082 are the leaders for shard 1 and shard 2, respectively
8083 - is assigned to shard 1
8084 - is assigned to shard 2

Queries come in, and sometimes it seems the windows for 8081 and 8083 respond to the query, but there are no results. If the queries run on 8081/8082 or 8081/8084, then results come back OK.

The query is nothing more than: q=*:*

Regards,

Matt

On Mon, Feb 27, 2012 at 9:26 PM, Matthew Parker mpar...@apogeeintegration.com wrote:
> I'll have to check on the commit situation. We have been pushing data from SharePoint the last week or so. Would that somehow block the documents moving between the solr instances?
> I'll try another version tomorrow. Thanks for the suggestions.

On Mon, Feb 27, 2012 at 5:34 PM, Mark Miller markrmil...@gmail.com wrote:
> Hmmm...all of that looks pretty normal...
> Did a commit somehow fail on the other machine? When you view the stats for the update handler, are there a lot of pending adds for one of the nodes? Do the commit counts match across nodes?
> You can also query an individual node with distrib=false to check that.
> If your build is a month old, I'd honestly recommend you try upgrading as well.
> - Mark
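Mark's distrib=false suggestion turns into a concrete check like the sketch below (the ports and the collection1 name are taken from earlier in the thread). distrib=false keeps the query on the node you hit instead of fanning out across the cluster, so the two replicas of a shard should report the same numFound.

# cluster-wide count, fanned out across shards
http://localhost:8081/solr/collection1/select?q=*:*&rows=0

# per-node counts; compare 8081 vs 8083 (shard 1) and 8082 vs 8084 (shard 2)
http://localhost:8081/solr/collection1/select?q=*:*&rows=0&distrib=false
http://localhost:8083/solr/collection1/select?q=*:*&rows=0&distrib=false
http://localhost:8082/solr/collection1/select?q=*:*&rows=0&distrib=false
http://localhost:8084/solr/collection1/select?q=*:*&rows=0&distrib=false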
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Hmm...this is very strange - there is nothing interesting in any of the logs? In clusterstate.json, all of the shards have an active state?

There are quite a few of us doing exactly this setup recently, so there must be something we are missing here...

Any info you can offer might help.

- Mark
Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
TWIMC:

Environment
===
Apache SOLR rev-1236154
Apache Zookeeper 3.3.4
Windows 7
JDK 1.6.0_23.b05

I have built a SOLR Cloud instance with 4 nodes using the embedded Jetty servers. I created a 3 node zookeeper ensemble to manage the solr configuration data. All the instances run on one server, so I've had to move ports around for the various applications.

I start the 3 zookeeper nodes. I started the first instance of solr cloud with the parameter to have two shards, then started the remaining 3 solr nodes.

The system comes up fine. No errors thrown. I can view the solr cloud console and I can see the SOLR configuration files managed by ZooKeeper.

I published data into the SOLR Cloud instances from SharePoint using Apache Manifold 0.4-incubating. Manifold is set up to publish the data into collection1, which is the only collection defined in the cluster.

When I query the data from collection1 as per the solr wiki, the results are inconsistent. Sometimes all the results are there; other times nothing comes back at all. It seems to be having an issue auto-replicating the data across the cloud.

Is there some specific setting I might have missed? Based upon what I read, I thought that SOLR Cloud would take care of distributing and replicating the data automatically. Do you have to tell it what shard to publish the data into as well?

Any help would be appreciated.

Thanks,

Matt
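One variable worth eliminating when counts fluctuate like this is commit state; a sketch, assuming the 8081 instance and the collection1 name from this setup: issue an explicit hard commit after Manifold finishes pushing, so every replica searches the same committed point.

# open a new searcher on the committed state of the collection
curl "http://localhost:8081/solr/collection1/update?commit=true"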
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Hey Matt - is your build recent? Can you visit the cloud/zookeeper page in the admin and send the contents of the clusterstate.json node? Are you using a custom index chain or anything out of the ordinary?

- Mark Miller
lucidimagination.com
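For reference, the zkCli shell bundled with ZooKeeper can also read that znode directly, independent of the admin UI; a sketch against the ensemble described in this thread (zkCli.cmd on Windows, zkCli.sh on Linux):

zkCli.cmd -server localhost:2181

[zk: localhost:2181(CONNECTED) 0] ls /
[zk: localhost:2181(CONNECTED) 1] get /clusterstate.json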
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Thanks for your reply Mark.

I believe the build was towards the beginning of the month. The solr.spec.version is 4.0.0.2012.01.10.38.09

I cannot access the clusterstate.json contents. I clicked on it a couple of times, but nothing happens. Is that stored on disk somewhere?

I configured a custom request handler to calculate a unique document id based on the file's url.

--
Regards,

Matt Parker (CTR)
Senior Software Architect
Apogee Integration, LLC
5180 Parkstone Drive, Suite #160
Chantilly, Virginia 20151
703.272.4797 (site)
703.474.1918 (cell)
www.apogeeintegration.com
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
On Feb 27, 2012, at 2:22 PM, Matthew Parker wrote:
> Thanks for your reply Mark.
> I believe the build was towards the beginning of the month. The solr.spec.version is 4.0.0.2012.01.10.38.09
> I cannot access the clusterstate.json contents. I clicked on it a couple of times, but nothing happens. Is that stored on disk somewhere?

Are you using the new admin UI? That has recently been updated to work better with cloud - it had some troubles not too long ago. If you are, you should try using the old admin UI's zookeeper page - that should show the cluster state.

That being said, there have been a lot of bug fixes over the past month - so you may just want to update to a recent version.

> I configured a custom request handler to calculate a unique document id based on the file's url.

- Mark Miller
lucidimagination.com
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I was trying to use the new interface. I can see it using the old admin page. Is there a piece of it you're interested in? I don't have access to the Internet where it exists, so it would mean transcribing it.

On Mon, Feb 27, 2012 at 2:47 PM, Mark Miller markrmil...@gmail.com wrote:
> On Feb 27, 2012, at 2:22 PM, Matthew Parker wrote:
>
> > Thanks for your reply Mark. I believe the build was towards the beginning
> > of the month. The solr.spec.version is 4.0.0.2012.01.10.38.09
> >
> > I cannot access the clusterstate.json contents. I clicked on it a couple
> > of times, but nothing happens. Is that stored on disk somewhere?
>
> Are you using the new admin UI? That has recently been updated to work
> better with cloud - it had some troubles not too long ago. If you are, you
> should try using the old admin UI's zookeeper page - that should show the
> cluster state.
>
> That being said, there have been a lot of bug fixes over the past month -
> so you may just want to update to a recent version.
>
> > I configured a custom request handler to calculate a unique document id
> > based on the file's url.
> >
> > On Mon, Feb 27, 2012 at 1:13 PM, Mark Miller markrmil...@gmail.com wrote:
> > > Hey Matt - is your build recent? Can you visit the cloud/zookeeper page
> > > in the admin and send the contents of the clusterstate.json node? Are
> > > you using a custom index chain or anything out of the ordinary?
> > >
> > > - Mark
> > >
> > > On Feb 27, 2012, at 12:26 PM, Matthew Parker wrote:
> > > > TWIMC:
> > > >
> > > > Environment:
> > > > Apache SOLR rev-1236154
> > > > Apache ZooKeeper 3.3.4
> > > > Windows 7
> > > > JDK 1.6.0_23.b05
> > > >
> > > > I have built a SOLR Cloud instance with 4 nodes using the embedded
> > > > Jetty servers. I created a 3-node zookeeper ensemble to manage the
> > > > solr configuration data. All the instances run on one server, so I've
> > > > had to move ports around for the various applications.
> > > >
> > > > I start the 3 zookeeper nodes. I start the first instance of solr
> > > > cloud with the parameter to have two shards, then start the remaining
> > > > 3 solr nodes. The system comes up fine. No errors thrown. I can view
> > > > the solr cloud console and I can see the SOLR configuration files
> > > > managed by ZooKeeper.
> > > >
> > > > I published data into the SOLR Cloud instances from SharePoint using
> > > > Apache Manifold 0.4-incubating. Manifold is set up to publish the
> > > > data into collection1, which is the only collection defined in the
> > > > cluster.
> > > >
> > > > When I query the data from collection1 as per the solr wiki, the
> > > > results are inconsistent. Sometimes all the results are there, other
> > > > times nothing comes back at all. It seems to be having an issue auto
> > > > replicating the data across the cloud.
> > > >
> > > > Is there some specific setting I might have missed? Based upon what I
> > > > read, I thought that SOLR cloud would take care of distributing and
> > > > replicating the data automatically. Do you have to tell it what shard
> > > > to publish the data into as well?
> > > >
> > > > Any help would be appreciated.
> > > >
> > > > Thanks,
> > > > Matt
>
> - Mark Miller
> lucidimagination.com

--
Regards,

Matt Parker (CTR)
Senior Software Architect
Apogee Integration, LLC
5180 Parkstone Drive, Suite #160
Chantilly, Virginia 20151
703.272.4797 (site)
703.474.1918 (cell)
www.apogeeintegration.com

--
This e-mail and any files transmitted with it may be proprietary. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of Apogee Integration.
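For reference, a two-shard cloud on one machine like the one described was typically brought up in this era with something along these lines (a sketch only; the ZooKeeper ports match the thread, but the conf path, config name, and exact command lines are illustrative, not Matt's actual setup):

    java -DzkHost=localhost:2181,localhost:2182,localhost:2183 -DnumShards=2 -Dbootstrap_confdir=./solr/conf -Dcollection.configName=configuration1 -jar start.jar

    java -Djetty.port=8082 -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar

The bootstrap_confdir/collection.configName pair is passed only on the first node's first start, to upload the config set to ZooKeeper; each additional node just points at the same zkHost with its own jetty.port.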
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Here is most of the cluster state:

Connected to Zookeeper localhost:2181, localhost:2182, localhost:2183

/ (v=0 children=7)
  /CONFIGS (v=0 children=1)
    /CONFIGURATION (v=0 children=25)
      all the configuration files, velocity info, xslt, etc.
  /NODE_STATES (v=0 children=4)
    MACHINE1:8083_SOLR (v=121) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8082_SOLR (v=101) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8081_SOLR (v=92) [{shard_id:shard1, state:active, core:, collection:collection1, node_name:...
    MACHINE1:8084_SOLR (v=73) [{shard_id:shard2, state:active, core:, collection:collection1, node_name:...
  /ZOOKEEPER (v=0 children=1)
    QUOTA (v=0)
  /CLUSTERSTATE.JSON (v=272) {collection1:{shard1:{MACHINE1:8081_solr_:{shard_id:shard1, leader:true,...
  /LIVE_NODES (v=0 children=4)
    MACHINE1:8083_SOLR (ephemeral v=0)
    MACHINE1:8082_SOLR (ephemeral v=0)
    MACHINE1:8081_SOLR (ephemeral v=0)
    MACHINE1:8084_SOLR (ephemeral v=0)
  /COLLECTIONS (v=1 children=1)
    COLLECTION1 (v=0 children=2) {configName:configuration1}
      LEADER_ELECT (v=0 children=2)
        SHARD1 (v=0 children=1)
          ELECTION (v=0 children=2)
            87186203314552835-MACHINE1:8081_SOLR_-N_96 (ephemeral v=0)
            87186203314552836-MACHINE1:8083_SOLR_-N_84 (ephemeral v=0)
        SHARD2 (v=0 children=1)
          ELECTION (v=0 children=2)
            231301391392833539-MACHINE1:8084_SOLR_-N_85 (ephemeral v=0)
            159243797356740611-MACHINE1:8082_SOLR_-N_84 (ephemeral v=0)
      LEADERS (v=0 children=2)
        SHARD1 (ephemeral v=0) {core:, node_name:MACHINE1:8081_solr, base_url:http://MACHINE1:8081/solr}
        SHARD2 (ephemeral v=0) {core:, node_name:MACHINE1:8082_solr, base_url:http://MACHINE1:8082/solr}
  /OVERSEER_ELECT (v=0 children=2)
    ELECTION (v=0 children=4)
      231301391392833539-MACHINE1:8084_SOLR_-N_000251 (ephemeral v=0)
      87186203314552835-MACHINE1:8081_SOLR_-N_000248 (ephemeral v=0)
      159243797356740611-MACHINE1:8082_SOLR_-N_000250 (ephemeral v=0)
      87186203314552836-MACHINE1:8083_SOLR_-N_000249 (ephemeral v=0)
    LEADER (ephemeral v=0) {id:87186203314552835-MACHINE1:8081_solr-n_00248}
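If the admin UI won't display clusterstate.json, the node can also be read straight out of ZooKeeper with the stock client that ships with ZooKeeper 3.3.x (a sketch; the ensemble address matches this thread, and it assumes zkCli.sh is on your path):

    zkCli.sh -server localhost:2181 get /clusterstate.json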
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
Hmmm...all of that looks pretty normal...

Did a commit somehow fail on the other machine? When you view the stats for the update handler, are there a lot of pending adds for one of the nodes? Do the commit counts match across nodes?

You can also query an individual node with distrib=false to check that.

If your build is a month old, I'd honestly recommend you try upgrading as well.

- Mark
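Mark's distrib=false check looks like this in practice: hit each core directly and compare numFound (hosts and ports follow the cluster state above; the URLs are illustrative and assume the default collection1 core):

    http://MACHINE1:8081/solr/select?q=*:*&distrib=false&rows=0
    http://MACHINE1:8083/solr/select?q=*:*&distrib=false&rows=0

Per the cluster state, 8081 and 8083 are the two replicas of shard1; if their numFound values differ, documents or commits are not reaching one of them.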
Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes
I'll have to check on the commit situation. We have been pushing data from SharePoint the last week or so. Would that somehow block the documents moving between the solr instances?

I'll try another version tomorrow.

Thanks for the suggestions.
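One quick way to check the commit situation (the URL is illustrative and assumes the default /update handler on that node): fire an explicit commit and see whether the per-node counts converge afterwards:

    curl "http://MACHINE1:8081/solr/update?commit=true"

If the distrib=false counts match after that but drift apart again during indexing, the problem is commit timing rather than replication.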
Re: inconsistent results when faceting on multivalued field
I think the key here is that you are a bit confused about what the multiValued thing is all about.

The fq clause says, essentially, "restrict all my search results to the documents where 1213206 occurs in sou_codeMetier". That's *all* the fq clause does.

Now, by saying facet.field=sou_codeMetier you're asking Solr to count the number of documents that exist for each unique value in that field. A single document can be counted many times; each bucket is a unique value in the field.

On the other hand, by saying facet.query=sou_codeMetier:[1213206 TO 1213206] you're asking Solr to count all the documents that make it through your query (*:* in this case) with *any* value in the indicated range.

Facet queries really have nothing to do with filter queries. That is, facet queries in no way restrict the documents that are returned; they just indicate ways of counting documents into buckets.

Best,
Erick
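A tiny worked example of Erick's point (the document values here are made up): suppose two documents match fq=sou_codeMetier:1213206:

    doc1: sou_codeMetier = [1213206, 1212104]
    doc2: sou_codeMetier = [1213206, 121320603]

facet.field=sou_codeMetier then reports 1213206:2, 1212104:1, 121320603:1 - each document is counted once per distinct value it holds, which is exactly the "extra" buckets Alain is seeing. Restricting which buckets are displayed is what the facet.prefix clause does:

    /select?q=*:*&fq=sou_codeMetier:1213206&facet=true&facet.field=sou_codeMetier&f.sou_codeMetier.facet.prefix=1213206&rows=0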
Re: inconsistent results when faceting on multivalued field
Could you clarify the below:

> When I make a search on facet.qua_code=1234567 ??

Are you trying to say that when you fire a fresh search for a facet item, like q=qua_code:1234567, it fetches documents where the qua_code field contains either the term 1234567 alone or both terms (1234567, 9384738, and other terms)? That is expected, since it's a multivalued field, and hence the facet counts are shown for both terms.

> If I reword the query as 'facet.query=qua_code:[1234567 TO 1234567]', I only get the expected counts

You will get facets only for documents which have the term 1234567 (facet.query applies to the facets, i.e. to which facet buckets are picked/shown).

Regds
Pravesh
Re: inconsistent results when faceting on multivalued field
Pravesh,

Not exactly. Here is the search I do, in more detail (different field name, but same issue). I want to get a count for a specific value of the sou_codeMetier field, which is multivalued. I expressed this by including an fq clause:

/select/?q=*:*&facet=true&facet.field=sou_codeMetier&fq=sou_codeMetier:1213206&rows=0

The response (excerpt only):

<lst name="facet_fields">
  <lst name="sou_codeMetier">
    <int name="1213206">1281</int>
    <int name="1212104">476</int>
    <int name="121320603">285</int>
    <int name="1213101">260</int>
    <int name="121320602">208</int>
    <int name="121320605">171</int>
    <int name="1212201">152</int>
    ...

As you see, I get back both the expected result and extra results I would expect to be filtered out by the fq clause. I can eliminate the extra results with an 'f.sou_codeMetier.facet.prefix=1213206' clause. But I wonder if Solr's behavior is correct, and how the fq filtering works exactly.

If I replace the facet.field clause with a facet.query clause, like this:

/select/?q=*:*&facet=true&facet.query=sou_codeMetier:[1213206 TO 1213206]&rows=0

the results contain a single item:

<lst name="facet_queries">
  <int name="sou_codeMetier:[1213206 TO 1213206]">1281</int>
</lst>

The 'fq=sou_codeMetier:1213206' clause isn't necessary here and does not affect the results.

Thanks,

Alain
Re: inconsistent results when faceting on multivalued field
My interpretation of your results is that your fq found 1281 documents with the value 1213206 in the sou_codeMetier field. Of those results, 476 also had 1212104 as a value... and so on. Since ALL the results will have the field value in your fq, I would expect the other values to occur equally or less often in the result set, which they appear to do.
inconsistent results when faceting on multivalued field
I am surprised by the results I am getting from a search in a Solr 3.4 index. My schema has a multivalued field of type 'string':

<field name="qua_code" type="string" multiValued="true" indexed="true" stored="true"/>

The field values are 7-digit or 9-digit integer numbers; this corresponds to a hierarchy. I could have used a numeric type instead of string, but no numerical operations are performed against the values. Now, each document contains 0-N values for this field, such as:

8625774
1234567
123456701
123456702
123456703
9384738

When I make a search on facet.qua_code=1234567, I am getting the counts I expect (seemingly correct) + a large number of counts for *other* field values (e.g. 9384738). If I reword the query as 'facet.query=qua_code:[1234567 TO 1234567]', I only get the expected counts. I can also filter out the extraneous results with a facet.prefix clause.

Should I file an issue, or am I misunderstanding something about faceting on multivalued fields?

Thanks.
Local Solr Inconsistent results for radius
Hello,

I have a question related to Local Solr. For certain locations (latitude, longitude), the spatial search does not work. Here is the query I try to make, which gives me no results:

q=*&qt=geo&sort=geo_distance asc&lat=33.718151&long=73.060547&radius=450

However, if I make the same query with radius=449, it gives me results. Here is the part of my solrconfig.xml containing startTier and endTier:

<updateRequestProcessorChain>
  <processor class="com.pjaol.search.solr.update.LocalUpdateProcessorFactory">
    <str name="latField">latitude</str>   <!-- The field used to store your latitude -->
    <str name="lngField">longitude</str>  <!-- The field used to store your longitude -->
    <int name="startTier">9</int>
    <int name="endTier">17</int>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
</updateRequestProcessorChain>

What do I need to do to fix this problem?

--
Muhammad Emad Mushtaq
http://www.emadmushtaq.com/
Re: Local Solr Inconsistent results for radius
Hi Emad,

I had the same issue ( http://old.nabble.com/Spatial---Local-Solr-radius-td26943608.html ); it seems that this happens only in eastern areas of the world. Try inverting the sign of all your longitudes, or translating all your longitudes to the west.

Cheers,
Mauricio
Re: Local Solr Inconsistent results for radius
Hello Mauricio,

Do you know why such a problem occurs? Does it have to do with certain latitudes/longitudes? If so, why is it happening? Is it a bug in Local Solr?

--
Muhammad Emad Mushtaq
http://www.emadmushtaq.com/
Re: Local Solr Inconsistent results for radius
Yes, it seems to be a bug, at least in the code you and I are using. If you don't need to search across the whole globe, try translating your longitudes as I suggested.
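To make the workaround concrete (a sketch only, reusing the coordinates from Emad's query; it assumes you control both indexing and querying so that the same shift is applied on both sides): with sign inversion, a document stored at longitude 73.060547 is instead indexed with longitude -73.060547, and the failing query becomes

q=*&qt=geo&sort=geo_distance asc&lat=33.718151&long=-73.060547&radius=450

Mirroring every longitude across the prime meridian preserves the distances between points, so radius results stay correct as long as all documents and all queries are shifted consistently and nothing ends up straddling the 180th meridian.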
Inconsistent results
Hi,

I use SOLR with the standard handler, and when I send the same exact query to Solr I get different results every time (i.e. refreshing the page with the query gives different results). Any ideas?

Thx,
Re: Inconsistent results in Solr Search with Lucene Index
I fixed that problem by reconfiguring schema.xml. Thanks for your help.

Jak
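For anyone hitting the same symptom ('dog*' matches but 'dog' does not - a classic sign that the terms in the Lucene index don't line up with the analysis Solr applies at query time): the schema.xml fix is to give the field a fieldType whose analyzer mirrors the Analyzer the Lucene index was built with. A minimal sketch with illustrative names only; the actual tokenizer and filters must match your Lucene setup:

<fieldType name="text_lucene" class="solr.TextField">
  <analyzer>
    <!-- must mirror the Lucene Analyzer used to build the index -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="content" type="text_lucene" indexed="true" stored="true"/>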
Re: Inconsistent results in Solr Search with Lucene Index
Have you set up your Analyzers, etc. so they correspond to the exact ones that you were using in Lucene? Under the Solr Admin you can try the analysis tool to see how your index and queries are treated. What happens if you do a *:* query from the Admin query screen?

If your index is reasonably sized, I would just reindex, but you shouldn't have to do this.

-Grant

On Nov 27, 2007, at 8:18 AM, trysteps wrote:

> Hi All,
> I am trying to use Solr search with a Lucene index, so I just set up all
> the schema.xml configs, like tokenizers and the necessary fields. But I
> cannot get the same results as with Lucene. For example, a search for
> 'dog' returns lots of results with Lucene, but in Solr I can't get any
> result - yet a search for 'dog*' returns the same results as Lucene.
> What is the best way to integrate a Lucene index into Solr? Are there any
> well-documented sources?
> Thanks for your Attention,
> Trysteps

--
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ