[jira] [Closed] (SOLR-5843) No way to clear error state of a core that doesn't even exist any more
[ https://issues.apache.org/jira/browse/SOLR-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nathan Neulinger closed SOLR-5843.
----------------------------------
    Resolution: Fixed

> No way to clear error state of a core that doesn't even exist any more
> ----------------------------------------------------------------------
>
>                 Key: SOLR-5843
>                 URL: https://issues.apache.org/jira/browse/SOLR-5843
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1
>            Reporter: Nathan Neulinger
>              Labels: cloud, failure, initialization
>
> Created collections with missing configs - this is known to create a problem
> state. Those collections have all since been deleted -- but one of my nodes
> still insists that there are initialization errors.
> There are no references to those 'failed' cores in any of the cloud tabs, or
> in ZK, or in the directories on the server itself.
> There should be some easy way to refresh this state or to clear them out
> without having to restart the instance.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6255) Misleading error message when using questionable update syntax
[ https://issues.apache.org/jira/browse/SOLR-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065922#comment-14065922 ]

Nathan Neulinger commented on SOLR-6255:
----------------------------------------

Example document:

{noformat}
{
  "at": "2014-07-10T21:28:41Z",
  "body": "message content here",
  "channel": [ "dev" ],
  "from": "ad...@x.com",
  "hive": "nneul",
  "id": "4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
  "message_id": "2014-07-10-4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
  "subject": "SOLR Testing",
  "timestamp": 1405027721000,
  "to": [ "a...@x.com", "b...@x.com", "c...@x.com", "d...@x.com" ],
  "type": "MESSAGE"
}
{noformat}

> Misleading error message when using questionable update syntax
> --------------------------------------------------------------
>
>                 Key: SOLR-6255
>                 URL: https://issues.apache.org/jira/browse/SOLR-6255
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>         Environment: 4.8.0, Linux x86_64, jdk 1.7.55, 2 x Node, External ZK, SolrCloud
>            Reporter: Nathan Neulinger
>         Attachments: schema.xml
>
> When issuing an update with the following questionable JSON as input, it
> returns (for the attached schema) an error that the required 'timestamp'
> field is missing.
> [ { "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
>     "channel": {"add": "preet"},
>     "channel": {"add": "adam"} }
> ]
> Everything I've found so far indicates that in JSON this technically appears
> to be allowed, but there isn't any consistency in how any particular library
> might interpret it.
> Using the more obviously correct format works without error.
> [ { "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
>     "channel": {"add": "preet"} },
>   { "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
>     "channel": {"add": "adam"} }
> ]
> Full schema attached, but the following are the only required fields:
> stored="true" required="true" multiValued="false" />
> stored="true" required="true" multiValued="false" />
> stored="true" required="true" multiValued="false" omitNorms="true" />
> stored="true" required="true" multiValued="false" omitNorms="true" />
> stored="true" required="true" multiValued="false" omitNorms="true"/>
> stored="true" required="true" multiValued="false" omitNorms="true" />
> Channel field:
> stored="true" required="false" multiValued="true" omitNorms="true"/>
> When I have a bit, I will try to reproduce with a minimally representative
> schema, but hopefully you can determine the reason it's parsing the way it is
> and have it generate a better error.
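The library inconsistency the report alludes to is easy to demonstrate outside Solr. As a hedged illustration (this uses Python's stdlib json module, not Solr's own JSON loader): many parsers silently keep only the last occurrence of a duplicated key, so one of the two atomic "add" operations is lost before the application ever sees it:

```python
import json

# JSON with a duplicated "channel" key, as sent in the problem update.
doc = """
[ { "id": "4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
    "channel": {"add": "preet"},
    "channel": {"add": "adam"} } ]
"""

parsed = json.loads(doc)

# Python's json module silently keeps only the LAST duplicate, so
# {"add": "preet"} is gone before the application ever sees it.
# Other libraries may keep the first occurrence, or reject the input.
print(parsed[0]["channel"])  # {'add': 'adam'}
```

RFC 8259 only says object member names "SHOULD" be unique, which is why duplicate keys parse without error yet behave differently from library to library.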
[jira] [Updated] (SOLR-6255) Misleading error message when using questionable update syntax
[ https://issues.apache.org/jira/browse/SOLR-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nathan Neulinger updated SOLR-6255:
-----------------------------------
    Attachment: schema.xml
[jira] [Created] (SOLR-6255) Misleading error message when using questionable update syntax
Nathan Neulinger created SOLR-6255:
--------------------------------------

             Summary: Misleading error message when using questionable update syntax
                 Key: SOLR-6255
                 URL: https://issues.apache.org/jira/browse/SOLR-6255
             Project: Solr
          Issue Type: Bug
          Components: query parsers
         Environment: 4.8.0, Linux x86_64, jdk 1.7.55, 2 x Node, External ZK, SolrCloud
            Reporter: Nathan Neulinger
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064477#comment-14064477 ]

Nathan Neulinger commented on SOLR-6251:
----------------------------------------

FYI. We finally tracked down the problem (at least 99.9% sure at this point), and it was staring me in the face the whole time - just never noticed:

{noformat}
[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}]
{noformat}

Look at the JSON... It's trying to add two channels... Should have been:

{noformat}
[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"}},
 {"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "adam"}}]
{noformat}

I half wonder how it chose to interpret that particular chunk of JSON, but either way, I think the origin of our issue is resolved.

> incorrect 'missing required field' during update - document definitely has it
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-6251
>                 URL: https://issues.apache.org/jira/browse/SOLR-6251
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.8
>         Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All
> on EC2. The two hosts are round-robin'd behind an ELB.
>            Reporter: Nathan Neulinger
>              Labels: replication
>         Attachments: schema.xml
>
> Document added on solr1. We can see the distribute take place from solr1 to
> solr2 and returning a success. Subsequent searches return the document,
> clearly showing the field as being there. Later on, an update is done to add
> to an element of the document - and the update fails. The update was sent to
> the solr2 instance.
> Schema marks the 'timestamp' field as required, so the initial insert should
> not work if the field isn't present.
> Symptom is intermittent - we're seeing this randomly, with no warning or
> triggering that we can see, but in all cases, it's getting the error in
> response to an update when the instance tries to distribute the change to the
> other node.
> Searches that were run AFTER the update also show the field as being present
> in the document.
> Will add full trace of operations in the comments shortly. pcap captures of
> ALL traffic for the two nodes on 8983 are available if requested.
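For reference, a sketch of two ways to express the same pair of adds without duplicate keys. The two-document form is the one used in the thread; the list-valued form relies on Solr's atomic-update "add" accepting an array of values (per the Solr reference guide on updating parts of documents). This only builds the payload in Python for illustration; it does not issue a real request:

```python
import json

DOC_ID = "4b2c4d09-31e2-4fe2-b767-3868efbdcda1"

# Form 1 (used in the thread): repeat the document, one atomic add each.
two_docs = [
    {"id": DOC_ID, "channel": {"add": "preet"}},
    {"id": DOC_ID, "channel": {"add": "adam"}},
]

# Form 2: a single document -- atomic "add" also accepts a list of values,
# which sidesteps duplicate keys entirely.
one_doc = [{"id": DOC_ID, "channel": {"add": ["preet", "adam"]}}]

# Either serializes to unambiguous JSON suitable for POSTing to /update.
payload = json.dumps(one_doc)
print(payload)
```

Either form survives any standards-compliant JSON parser unchanged, which is the property the duplicate-key version lacked.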
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062909#comment-14062909 ]

Nathan Neulinger commented on SOLR-6251:
----------------------------------------

Additionally - since this works 99.9% of the time - I would surely think that a problem as blatant as that would have been more visible. The incremental updates work normally without issue, and just randomly fail.
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062907#comment-14062907 ]

Nathan Neulinger commented on SOLR-6251:
----------------------------------------

Leaving closed, but adding more information in case Hoss Man will comment additionally.

'timestamp' is: stored=true indexed=false

That seems to meet all of the requirements stated for partial updates unless 'indexed=true' is also required and not documented.
[jira] [Updated] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nathan Neulinger updated SOLR-6251:
-----------------------------------
    Attachment: schema.xml

schema attached
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062806#comment-14062806 ]

Nathan Neulinger commented on SOLR-6251:
----------------------------------------

We are open to diagnostic suggestions on this, but are at a loss since this appears to be very intermittent and non-reproducible other than by waiting.

Looking at our solrconfig.xml compared to what is currently in the 4.8.0 example - there are a variety of differences, most of which look like they are due to this config originally being based on the 4.4 solrconfig.xml example.
[jira] [Comment Edited] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062800#comment-14062800 ]

Nathan Neulinger edited comment on SOLR-6251 at 7/15/14 10:51 PM:
------------------------------------------------------------------

and here's an update in that same debug log from shortly before the error (the distribute from the insert of the document on solr1):

{noformat}
2014-07-10 21:29:49,313 INFO qtp1599863753-30844 [solr.update.processor.LogUpdateProcessor] - [d-_v22_shard1_replica2] webapp=/solr path=/update params={distrib.from=http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/&update.distrib=TOLEADER&wt=javabin&version=2} {add=[4b2c4d09-31e2-4fe2-b767-3868efbdcda1 (1473278419196182528)]} 0 11
2014-07-10 21:29:49,416 INFO qtp1599863753-30844 [org.apache.solr.update.UpdateHandler] - start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
{noformat}
[jira] [Comment Edited] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062793#comment-14062793 ]

Nathan Neulinger edited comment on SOLR-6251 at 7/15/14 10:52 PM:
------------------------------------------------------------------

{noformat}
16.24  = POD SRV
16.204 = SOLR 1
16.207 = SOLR 2

16.24 ⇒ 16.204 CAP 1 11344 14:29:49.299883
POST /solr/d-_v22/update/json?commit=true HTTP/1.1
host: d01-solr.srv.hivepoint.com
Accept-Encoding: gzip,deflate
Content-Type: application/json; charset=UTF-8
request_id: null 8677c2fb-8b92-4220-bb73-1e4c610d95be 2057
User-Agent: HivePoint (Factory JSON client:null:2056)
X-Forwarded-For: 10.220.16.229
X-Forwarded-Port: 80
X-Forwarded-Proto: http
Content-Length: 1555
Connection: keep-alive

{ "add": { "commitWithin" : 5000, "doc" : {"hive":"vdates","at":"2014-07-10T21:28:41Z","timestamp":1405027721000,"type":"MESSAGE","channel":["dev"],"from":"pr...@sevogle.com","to":["a...@sevogle.com","vi...@sevogle.com","d...@sevogle.com","s...@hive.sevogle.com"],"subject":"Re: Deployments - B and then C","body":"eve.SNIP...stem. ","id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1"} } }

16.204 ⇒ 16.207 CAP 1
POST /solr/d-_v22_shard1_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.220.16.204%3A8983%2Fsolr%2Fd-_v22_shard1_replica1%2F&wt=javabin&version=2 HTTP/1.1
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
Content-Type: application/javabin
Transfer-Encoding: chunked
Host: 10.220.16.207:8983
Connection: Keep-Alive

64c ...¶ms...update.distrib(TOLEADER.,distrib.from?.http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/.&delByQ..'docsMap.?$hive&vdates."at42014-07-10T21:28:41Z.)timestampx...$type'MESSAGE.'channel.#dev.$from1pr...@sevogle.com."to.0adam@sevogle.com1vi...@sevogle.com/dev@sevogle.com4...@hive.sevogle.com.'subject>Re: Deployments - B and then C.$body?#eve.SNIP...tem. ."id?.4b2c4d09-31e2-4fe2-b767-3868efbdcda1.*message_id?.2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1 .."ow.."cwX...
0

16.207 ⇒ 16.204 CAP 1 11368 14:29:49.495301
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 40

responseHeader..&status..%QTimeK

16.24 ⇒ 16.204 CAP 1 11371 14:29:49.496308 INDEX COMPLETE
HTTP/1.1 200 OK
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

2C
{"responseHeader":{"status":0,"QTime":195}}
0

16.24 ⇒ 16.207 CAP 2 9218 14:29:57.065156 / 9232 14:29:57.099274
Search (two different search results to two servers?) that show the timestamp is set.
POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1
host: d01-solr.srv.hivepoint.com
Accept-Encoding: gzip,deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
request_id: null 957d1ca5-7200-4058-9c70-16a17fc64c19 2069
User-Agent: HivePoint (Factory JSON client:null:2068)
X-Forwarded-For: 10.220.16.229
X-Forwarded-Port: 80
X-Forwarded-Proto: http
Content-Length: 244
Connection: keep-alive

q=%2B%28*%29&fq=%2Bhive%3Avdates+AND+%2Bchannel%3A%28adam+bethany+dev+notifications+preet+share%29+AND+at%3A%5B2014-07-10T21%3A27%3A56Z+TO+*%5D&start=0&rows=300&sort=at+desc%2C+id+desc&fl=id,hive,timestamp,type,message_id,file_instance_id,score

HTTP/1.1 200 OK
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

2BB
{
  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "fl":"id,hive,timestamp,type,message_id,file_instance_id,score",
      "sort":"at desc, id desc",
      "indent":"on",
      "start":"0",
      "q":"+(*)",
      "wt":"json",
      "fq":"+hive:vdates AND +channel:(adam bethany dev notifications preet share) AND at:[2014-07-10T21:27:56Z TO *]",
      "rows":"300"}},
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "hive":"vdates",
        "timestamp":1405027721000,
        "type":"MESSAGE",
        "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
        "message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1",
        "score":1.0}]
  }}
0

16.24 ⇒ 16.207 CAP 2 9415 14:30:00.310995 Update Channel
POST /solr/d-_v22/update?commit=true HTTP/1.1
host: d01-solr.srv.hivepoint.com
Accept-Encoding: gzip,deflate
Content-Type: application/json; charset=UTF-8
request_id: null 92fa6c11-78d8-44cc-a143-9ff3e4c132f4 2115
User-Agent: HivePoint (Factory JSON client:null:2114)
X-Forwarded-For: 10.220.16.229
X-Forwarded-Port: 80
X-Forwarded-Proto: http
Content-Length: 102
Connection: keep-alive

[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}]

HTTP/1.1 400 Bad Request
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

96
{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"[doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp","code":400}}
0

CAP 2 9602 14:30:08.082758 Subsequent search, after update
POST /
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062796#comment-14062796 ]

Nathan Neulinger commented on SOLR-6251:
----------------------------------------

this is the occurrence of the error on the server the update ran on

{noformat}
2014-07-10 21:30:00,313 ERROR qtp1599863753-30801 [org.apache.solr.core.SolrCore ] - org.apache.solr.common.SolrException: [doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:189)
        at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
        at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:393)
        at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:118)
        at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:102)
        at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:66)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062793#comment-14062793 ] Nathan Neulinger commented on SOLR-6251: 16.24 = POD SRV 16.204 = SOLR 1 16.207 = SOLR 2 16.24 ⇒ 16.204 CAP 1 11344 14:29:49.299883 POST /solr/d-_v22/update/json?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 8677c2fb-8b92-4220-bb73-1e4c610d95be 2057 User-Agent: HivePoint (Factory JSON client:null:2056) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 1555 Connection: keep-alive { "add": { "commitWithin" : 5000, "doc" : {"hive":"vdates","at":"2014-07-10T21:28:41Z","timestamp":1405027721000,"type":"MESSAGE","channel":["dev"],"from":"pr...@sevogle.com","to":["a...@sevogle.com","vi...@sevogle.com","d...@sevogle.com","s...@hive.sevogle.com"],"subject":"Re: Deployments - B and then C","body":"eve.SNIP...stem. ","id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1"} } } 16.204 ⇒ 16.207 CAP 1 POST /solr/d-_v22_shard1_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.220.16.204%3A8983%2Fsolr%2Fd-_v22_shard1_replica1%2F&wt=javabin&version=2 HTTP/1.1 User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0 Content-Type: application/javabin Transfer-Encoding: chunked Host: 10.220.16.207:8983 Connection: Keep-Alive 64c ...¶ms...update.distrib(TOLEADER.,distrib.from?.http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/.&delByQ..'docsMap.?$hive&vdates."at42014-07-10T21:28:41Z.)timestampx...$type'MESSAGE.'channel.#dev.$from1pr...@sevogle.com."to.0adam@sevogle.com1vi...@sevogle.com/dev@sevogle.com4...@hive.sevogle.com.'subject>Re: Deployments - B and then C.$body?#eve.SNIP...tem. 
."id?.4b2c4d09-31e2-4fe2-b767-3868efbdcda1.*message_id?.2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1 .."ow.."cwX... 0 16.207 ⇒ 16.204 CAP 1 11368 14:29:49.495301 HTTP/1.1 200 OK Content-Type: application/octet-stream Content-Length: 40 responseHeader..&status..%QTimeK 16.24 ⇒ 16.204 CAP 1 11371 14:29:49.496308 INDEX COMPLETE HTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2C {"responseHeader":{"status":0,"QTime":195}} 0 16.24 ⇒ 16.207 CAP 2 9218 14:29:57.065156 9232 14:29”57.099274 Search (two different search results to two servers?) that show the timestamp is set. POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/x-www-form-urlencoded; charset=UTF-8 request_id: null 957d1ca5-7200-4058-9c70-16a17fc64c19 2069 User-Agent: HivePoint (Factory JSON client:null:2068) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 244 Connection: keep-alive q=%2B%28*%29&fq=%2Bhive%3Avdates+AND+%2Bchannel%3A%28adam+bethany+dev+notifications+preet+share%29+AND+at%3A%5B2014-07-10T21%3A27%3A56Z+TO+*%5D&start=0&rows=300&sort=at+desc%2C+id+desc&fl=id,hive,timestamp,type,message_id,file_instance_id,scoreHTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2BB { "responseHeader":{ "status":0, "QTime":3, "params":{ "fl":"id,hive,timestamp,type,message_id,file_instance_id,score", "sort":"at desc, id desc", "indent":"on", "start":"0", "q":"+(*)", "wt":"json", "fq":"+hive:vdates AND +channel:(adam bethany dev notifications preet share) AND at:[2014-07-10T21:27:56Z TO *]", "rows":"300"}}, "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[ { "hive":"vdates", "timestamp":1405027721000, "type":"MESSAGE", "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1", "message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1", "score":1.0}] }} 0 16.24 ⇒ 16.207 CAP 2 9415 14:30:00.310995 Update Channel POST 
/solr/d-_v22/update?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 92fa6c11-78d8-44cc-a143-9ff3e4c132f4 2115 User-Agent: HivePoint (Factory JSON client:null:2114) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 102 Connection: keep-alive [{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}]HTTP/1.1 400 Bad Request Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 96 {"responseHeader":{"status":400,"QTime":1},"error":{"msg":"[doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp","code":400}} 0 CAP 2 9602 14:30:08.082758 Subsequent search, after update POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1 host: d01-solr.s
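The rejected update body above repeats the "channel" key inside one document. As noted in SOLR-6255, JSON technically appears to allow this, but each parser handles it differently. A short illustration in plain Python (not Solr's own JSON loader, so only suggestive of the problem): the second duplicate silently replaces the first, and a stricter parse can be made to refuse the input instead.

```python
import json

# The document body from the failing update, with "channel" repeated.
payload = '''[{"id": "4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
               "channel": {"add": "preet"},
               "channel": {"add": "adam"}}]'''

docs = json.loads(payload)
# Python's parser keeps only the LAST duplicate key, silently:
assert docs[0]["channel"] == {"add": "adam"}

# A stricter hook that refuses duplicate keys outright:
def reject_duplicates(pairs):
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key: %s" % key)
        seen[key] = value
    return seen

try:
    json.loads(payload, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)  # duplicate key: channel
```

Whether Solr's loader keeps the first value, the last, or something else entirely is exactly the kind of implementation detail the spec leaves open, which is why the unambiguous single-key form is safer.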
[jira] [Created] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
Nathan Neulinger created SOLR-6251: -- Summary: incorrect 'missing required field' during update - document definitely has it Key: SOLR-6251 URL: https://issues.apache.org/jira/browse/SOLR-6251 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.8 Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All on EC2. The two hosts are round-robin'd behind an ELB. Reporter: Nathan Neulinger Document added on solr1. We can see the distribute take place from solr1 to solr2 and returning a success. Subsequent searches return the document, clearly showing the field as being there. Later on, an update is done to add to an element of the document - and the update fails. The update was sent to the solr2 instance. Schema marks the 'timestamp' field as required, so the initial insert should not work if the field isn't present. Symptom is intermittent - we're seeing this randomly, with no warning or triggering that we can see, but in all cases, it's getting the error in response to an update when the instance tries to distribute the change to the other node. Searches that were run AFTER the update also show the field as being present in the document. Will add full trace of operations in the comments shortly. pcap captures of ALL traffic for the two nodes on 8983 are available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
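For comparison, the "more obviously correct format" that SOLR-6255 says works without error puts both additions under a single key. A hedged sketch of building that payload (the `atomic_add` helper is hypothetical; the id, field name, and endpoint come from the captures above):

```python
import json

def atomic_add(doc_id, field, values):
    """Build one Solr atomic-update document that adds several values to a
    multi-valued field under a single key, avoiding duplicate keys."""
    return {"id": doc_id, field: {"add": list(values)}}

update = [atomic_add("4b2c4d09-31e2-4fe2-b767-3868efbdcda1",
                     "channel", ["preet", "adam"])]
body = json.dumps(update)
# POST this body to /solr/<collection>/update?commit=true with
# Content-Type: application/json, as in the captures above.
print(body)
```

Since every key appears once, no parser along the path (client library, load balancer, Solr's loader) has to guess which duplicate to keep.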
[jira] [Commented] (SOLR-5843) No way to clear error state of a core that doesn't even exist any more
[ https://issues.apache.org/jira/browse/SOLR-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925416#comment-13925416 ] Nathan Neulinger commented on SOLR-5843: In case it helps any - interesting result - I restarted the node with the error this evening, and that bogus/broken collection spontaneously tried to recreate itself on that node. No error msg in the log though - just shows as all replicas being down. I issued a delete against it and got this error: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server Error request: http://10.220.16.191:8983/solr/admin/cores but, it did clear it out... Not sure what state it was in. If y'all think this is a goofy edge case that isn't likely to re-occur, go ahead and close this. Either way though, I do think there should be a way to tell solr "clear your errors and retry/refresh". > No way to clear error state of a core that doesn't even exist any more > -- > > Key: SOLR-5843 > URL: https://issues.apache.org/jira/browse/SOLR-5843 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6.1 >Reporter: Nathan Neulinger > Labels: cloud, failure, initialization > > Created collections with missing configs - this is known to create a problem > state. Those collections have all since been deleted -- but one of my nodes > still insists that there are initialization errors. > There are no references to those 'failed' cores in any of the cloud tabs, or > in ZK, or in the directories on the server itself. > There should be some easy way to refresh this state or to clear them out > without having to restart the instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5843) No way to clear error state of a core that doesn't even exist any more
[ https://issues.apache.org/jira/browse/SOLR-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925319#comment-13925319 ] Nathan Neulinger commented on SOLR-5843: External 3 node zk ensemble being used. > No way to clear error state of a core that doesn't even exist any more > -- > > Key: SOLR-5843 > URL: https://issues.apache.org/jira/browse/SOLR-5843 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6.1 >Reporter: Nathan Neulinger > Labels: cloud, failure, initialization > > Created collections with missing configs - this is known to create a problem > state. Those collections have all since been deleted -- but one of my nodes > still insists that there are initialization errors. > There are no references to those 'failed' cores in any of the cloud tabs, or > in ZK, or in the directories on the server itself. > There should be some easy way to refresh this state or to clear them out > without having to restart the instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5843) No way to clear error state of a core that doesn't even exist any more
[ https://issues.apache.org/jira/browse/SOLR-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925317#comment-13925317 ] Nathan Neulinger commented on SOLR-5843: Two node SolrCloud 4.6.1 deployment. Do a collection create with a config name that isn't mapped in ZK. You'll get the initialization failures. Now - this part is a bit vague - I don't remember the exact cleanup operations I did - but I think if you go and delete the invalid collection it may or may not make the errors go away. I thought that in previous cases when I issued the calls to delete the improperly created collections that it cleaned the errors up, but it doesn't appear to have in this case. It's possible that one of the nodes was in a weird state at the time, not sure. My current situation is I have two nodes, and ONE of them still has the bogus errors on it, even though all the tabs (and zk tree view) all show no references to the invalid cores. beta1_urlDebug_x_v16_shard1_replica2: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection beta1_urlDebug_x_v16_shard1_replica2 found:[c-v17, default] It's almost like it lost track of the fact that the collection was deleted for the purpose of the error reporting. I also can't find ANY reference to that error in the logs currently on the box, so appears to be in-memory only. > No way to clear error state of a core that doesn't even exist any more > -- > > Key: SOLR-5843 > URL: https://issues.apache.org/jira/browse/SOLR-5843 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6.1 >Reporter: Nathan Neulinger > Labels: cloud, failure, initialization > > Created collections with missing configs - this is known to create a problem > state. Those collections have all since been deleted -- but one of my nodes > still insists that there are initialization errors. 
> There are no references to those 'failed' cores in any of the cloud tabs, or > in ZK, or in the directories on the server itself. > There should be some easy way to refresh this state or to clear them out > without having to restart the instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5843) No way to clear error state of a core that doesn't even exist any more
Nathan Neulinger created SOLR-5843: -- Summary: No way to clear error state of a core that doesn't even exist any more Key: SOLR-5843 URL: https://issues.apache.org/jira/browse/SOLR-5843 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Reporter: Nathan Neulinger Created collections with missing configs - this is known to create a problem state. Those collections have all since been deleted -- but one of my nodes still insists that there are initialization errors. There are no references to those 'failed' cores in any of the cloud tabs, or in ZK, or in the directories on the server itself. There should be some easy way to refresh this state or to clear them out without having to restart the instance. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
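Pending a real "clear your errors" API, the stale state could at least be scripted around. A sketch, assuming the CoreAdmin STATUS response exposes an `initFailures` map of core name to error message (verify that response shape against your Solr version); here the cleanup UNLOAD URL is only printed, not called:

```python
def failed_cores(status_response):
    """Names of cores listed in the 'initFailures' map of a CoreAdmin
    STATUS response (assumed response shape -- verify on your version)."""
    return sorted(status_response.get("initFailures", {}))

# Sample shaped after the errors quoted in this issue:
sample = {
    "responseHeader": {"status": 0},
    "initFailures": {
        "beta1_urlDebug_x_v16_shard1_replica2":
            "org.apache.solr.common.cloud.ZooKeeperException: "
            "Could not find configName for collection",
    },
}

for core in failed_cores(sample):
    # Hypothetical cleanup, mirroring the delete the reporter issued:
    print("http://localhost:8983/solr/admin/cores?action=UNLOAD&core=" + core)
```

As the comments above show, whether an UNLOAD actually clears an in-memory failure for a core that no longer exists is exactly what this issue is about, so treat this as a best-effort workaround.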
[jira] [Closed] (SOLR-5665) Would like a 'REBUILD' method in the collections api to reconstruct missing cores
[ https://issues.apache.org/jira/browse/SOLR-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Neulinger closed SOLR-5665. -- Resolution: Won't Fix Response indicates it'll be addressed with future functionality in a different way. Attached script is available if anyone wants it or you want to stick it in contrib. > Would like a 'REBUILD' method in the collections api to reconstruct missing > cores > - > > Key: SOLR-5665 > URL: https://issues.apache.org/jira/browse/SOLR-5665 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Nathan Neulinger > Attachments: recreate-solr-cores.pl > > > Test Scenario: >Multinode solrCloud deployment >Completely delete one of those nodes >Bring it back online empty (no cores, no indexes at all) >Node will come up with each of the replicas in recovery mode and they will > stay there forever since the cores don't actually exist. > I've written a small external script that goes and gets the > collections/shards/replicas->core mapping from clusterstate.json, and then > calls the core create url for each core that is in a down state. This was > relatively easy to implement externally - and should be even more trivial to > implement inside of the collections api. > I envision this to apply this to all collections: > http://localhost:8983/solr/admin/collections?action=REBUILD > and this to apply the operation to a single collection: > http://localhost:8983/solr/admin/collections?action=REBUILD&name=collname > Independently, I'd like to see this sort of "server is blank, recreate > missing automatically" triggered automatically, but I can see where that > might not be expected behavior. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5665) Would like a 'REBUILD' method in the collections api to reconstruct missing cores
[ https://issues.apache.org/jira/browse/SOLR-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Neulinger updated SOLR-5665: --- Attachment: recreate-solr-cores.pl Simple example Perl script that does the job, included to show how trivial it is. > Would like a 'REBUILD' method in the collections api to reconstruct missing > cores > - > > Key: SOLR-5665 > URL: https://issues.apache.org/jira/browse/SOLR-5665 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Nathan Neulinger > Attachments: recreate-solr-cores.pl > > > Test Scenario: >Multinode solrCloud deployment >Completely delete one of those nodes >Bring it back online empty (no cores, no indexes at all) >Node will come up with each of the replicas in recovery mode and they will > stay there forever since the cores don't actually exist. > I've written a small external script that goes and gets the > collections/shards/replicas->core mapping from clusterstate.json, and then > calls the core create url for each core that is in a down state. This was > relatively easy to implement externally - and should be even more trivial to > implement inside of the collections api. > I envision this to apply this to all collections: > http://localhost:8983/solr/admin/collections?action=REBUILD > and this to apply the operation to a single collection: > http://localhost:8983/solr/admin/collections?action=REBUILD&name=collname > Independently, I'd like to see this sort of "server is blank, recreate > missing automatically" triggered automatically, but I can see where that > might not be expected behavior. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5665) Would like a 'REBUILD' method in the collections api to reconstruct missing cores
Nathan Neulinger created SOLR-5665: -- Summary: Would like a 'REBUILD' method in the collections api to reconstruct missing cores Key: SOLR-5665 URL: https://issues.apache.org/jira/browse/SOLR-5665 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Nathan Neulinger Test Scenario: Multinode solrCloud deployment Completely delete one of those nodes Bring it back online empty (no cores, no indexes at all) Node will come up with each of the replicas in recovery mode and they will stay there forever since the cores don't actually exist. I've written a small external script that goes and gets the collections/shards/replicas->core mapping from clusterstate.json, and then calls the core create url for each core that is in a down state. This was relatively easy to implement externally - and should be even more trivial to implement inside of the collections api. I envision this to apply this to all collections: http://localhost:8983/solr/admin/collections?action=REBUILD and this to apply the operation to a single collection: http://localhost:8983/solr/admin/collections?action=REBUILD&name=collname Independently, I'd like to see this sort of "server is blank, recreate missing automatically" triggered automatically, but I can see where that might not be expected behavior. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
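The attached recreate-solr-cores.pl is not reproduced here, but the logic the report describes can be sketched as follows (in Python rather than Perl; the clusterstate.json field names assume the 4.x layout and should be verified against your deployment):

```python
def cores_to_recreate(clusterstate, node_name):
    """Collect (collection, shard, core) for this node's 'down' replicas.
    Field names follow the 4.x clusterstate.json layout (an assumption)."""
    found = []
    for coll_name, coll in clusterstate.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica in shard.get("replicas", {}).values():
                if (replica.get("node_name") == node_name
                        and replica.get("state") == "down"):
                    found.append((coll_name, shard_name, replica["core"]))
    return found

# Minimal made-up clusterstate with one down replica on the blank node:
state = {"coll1": {"shards": {"shard1": {"replicas": {
    "core_node2": {"core": "coll1_shard1_replica2",
                   "node_name": "10.220.16.207:8983_solr",
                   "state": "down"}}}}}}

for coll, shard, core in cores_to_recreate(state, "10.220.16.207:8983_solr"):
    # Hypothetical CoreAdmin CREATE call the script would then issue:
    print("/solr/admin/cores?action=CREATE&name=%s&collection=%s&shard=%s"
          % (core, coll, shard))
```

Once the missing cores are created with the names the cluster state expects, the node can rejoin recovery normally, which is the behavior the proposed REBUILD action would automate.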
[jira] [Closed] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Neulinger closed SOLR-5407. -- Resolution: Unresolved > Strange error condition with cloud replication not working quite right > -- > > Key: SOLR-5407 > URL: https://issues.apache.org/jira/browse/SOLR-5407 > Project: Solr > Issue Type: Bug >Affects Versions: 4.5 >Reporter: Nathan Neulinger > Labels: cloud, replication > > I have a cloud > deployment of 4.5 on EC2. Architecture is 3 dedicated ZK > nodes, and a pair of solr nodes. I'll apologize in advance that this error > report is not going to have a lot of detail, I'm really hoping that the > scenario/description will trigger some "likely" possible explanation. > The situation I got into was that the server had decided to fail over, so my > app servers were all talking to what should have been the primary for most of the > shards/collections, but actually was the replica. > Here's where it gets odd - no errors being returned to the client code for > any of the searches or document updates - and the current primary server was > definitely receiving all of the updates - even though they were being > submitted to the inactive/replica node. (clients talking to solr-p1, which > was not primary at the time, and writes were being passed through to solr-r1, > which was primary at the time.) > All sounds good so far right? Except - the replica server at the time, > through which the writes were passing - never got any of those content > updates. It had an old unmodified copy of the index. > I restarted solr-p1 (was the replica at the time) - no change in behavior. > Behavior did not change until I killed and restarted the current primary > (solr-r1) to force it to fail over. > At that point, everything was all happy again and working properly. > Until this morning, when one of the developers provisioned a new collection, > which happened to put its primary on solr-r1. 
Again, clients all pointing at > solr-p1. The developer reported that the documents were going into the index, > but not visible on the replica server. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881131#comment-13881131 ] Nathan Neulinger commented on SOLR-5638: Will the unload result in the core still being on disk - just not loaded? In which case, what happens when the create collection is requested again and it decides to lay out the replicas in the other order? > Collection creation partially works, but results in unusable configuration > due to missing config in ZK > -- > > Key: SOLR-5638 > URL: https://issues.apache.org/jira/browse/SOLR-5638 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6 >Reporter: Nathan Neulinger > Attachments: SOLR-5638.patch > > > Need help properly recovering from 'collection gets created without config > being defined'. > Right now, if you submit a collection create and the config is missing, it > will proceed with partially creating cores, but then the cores fail to load. > This requires manual intervention on the server to fix unless you pick a new > collection name. > What's worse - if you retry the create a second time, it will usually try to > create the replicas in the opposite order, resulting in TWO broken cores on > each box, one for each attempted replica. > beta1-newarch_hive1_v12_shard1_replica1: > org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: > Specified config does not exist in ZooKeeper:hivepoint-unknown > beta1-newarch_hive1_v12_shard1_replica2: > org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: > Specified config does not exist in ZooKeeper:hivepoint-unknown > I already know how to clear this up manually, but this is something where > solr is allowing a condition in an external service to result in a > corrupted/partial configuration. 
> I can see an easy option for resolving this as a workaround - allow a > collection CREATE operation to specify "reuseCores" - i.e. allow it to use > an existing core of the proper name if it already exists. > Right now you wind up getting: > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica1': Could not > create a new core in solr/beta1-newarch_hive1_v12_shard1_replica1/as another > core is already defined there > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica2': Could not > create a new core in solr/beta1-newarch_hive1_v12_shard1_replica2/as another > core is already defined there -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874134#comment-13874134 ] Nathan Neulinger commented on SOLR-5638: Alternatively/additionally - solr really should be checking for validity of the requested create. If you ask for a configName, and it doesn't exist - error out then instead of proceeding with the create that is guaranteed to fail as a whole. Procedure to reproduce: do a collection create for a config name that doesn't exist in ZK. > Collection creation partially works, but results in unusable configuration > due to missing config in ZK > -- > > Key: SOLR-5638 > URL: https://issues.apache.org/jira/browse/SOLR-5638 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6 >Reporter: Nathan Neulinger > > Need help properly recovering from 'collection gets created without config > being defined'. > Right now, if you submit a collection create and the config is missing, it > will proceed with partially creating cores, but then the cores fail to load. > This requires manual intervention on the server to fix unless you pick a new > colllection name: > What's worse - if you retry the create a second time, it will usually try to > create the replicas in the opposite order, resulting in TWO broken cores on > each box, one for each attempted replica. > beta1-newarch_hive1_v12_shard1_replica1: > org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: > Specified config does not exist in ZooKeeper:hivepoint-unknown > beta1-newarch_hive1_v12_shard1_replica2: > org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: > Specified config does not exist in ZooKeeper:hivepoint-unknown > I already know how to clear this up manually, but this is something where > solr is allowing a condition in external service to result in a > corrupted/partial configuration. 
> I can see an easy option for resolving this as a workaround - allow a > collection CREATE operation to specify "reuseCores" - i.e. allow it to use > an existing core of the proper name if it already exists. > Right now you wind up getting: > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica1': Could not > create a new core in solr/beta1-newarch_hive1_v12_shard1_replica1/as another > core is already defined there > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica2': Could not > create a new core in solr/beta1-newarch_hive1_v12_shard1_replica2/as another > core is already defined there -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
Nathan Neulinger created SOLR-5638: -- Summary: Collection creation partially works, but results in unusable configuration due to missing config in ZK Key: SOLR-5638 URL: https://issues.apache.org/jira/browse/SOLR-5638 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6 Reporter: Nathan Neulinger Need help properly recovering from 'collection gets created without config being defined'. Right now, if you submit a collection create and the config is missing, it will proceed with partially creating cores, but then the cores fail to load. This requires manual intervention on the server to fix unless you pick a new collection name. What's worse - if you retry the create a second time, it will usually try to create the replicas in the opposite order, resulting in TWO broken cores on each box, one for each attempted replica. beta1-newarch_hive1_v12_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Specified config does not exist in ZooKeeper:hivepoint-unknown beta1-newarch_hive1_v12_shard1_replica2: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Specified config does not exist in ZooKeeper:hivepoint-unknown I already know how to clear this up manually, but this is something where solr is allowing a condition in an external service to result in a corrupted/partial configuration. I can see an easy option for resolving this as a workaround - allow a collection CREATE operation to specify "reuseCores" - i.e. allow it to use an existing core of the proper name if it already exists. 
Right now you wind up getting: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica1': Could not create a new core in solr/beta1-newarch_hive1_v12_shard1_replica1/as another core is already defined there org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica2': Could not create a new core in solr/beta1-newarch_hive1_v12_shard1_replica2/as another core is already defined there -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
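The pre-flight validation asked for in the comments above amounts to a membership check before CREATE proceeds to create any cores. A minimal sketch, reusing the error text from this report; the `validate_create` function is illustrative only, not Solr code:

```python
def validate_create(config_name, known_configs):
    """Refuse a collection CREATE up front when the named config set is
    absent, instead of half-creating cores that can never load."""
    if config_name not in known_configs:
        raise ValueError(
            "Specified config does not exist in ZooKeeper:" + config_name)
    return True

known = ["c-v17", "default"]  # config names seen elsewhere in these reports
assert validate_create("c-v17", known)
try:
    validate_create("hivepoint-unknown", known)
except ValueError as err:
    print(err)  # Specified config does not exist in ZooKeeper:hivepoint-unknown
```

Failing fast here would avoid the partially created replicas entirely, so neither the "reuseCores" workaround nor manual cleanup would be needed.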
[jira] [Created] (SOLR-5626) Would like to be able to 'upconfig' and 'linkconfig' via collections api
Nathan Neulinger created SOLR-5626: -- Summary: Would like to be able to 'upconfig' and 'linkconfig' via collections api Key: SOLR-5626 URL: https://issues.apache.org/jira/browse/SOLR-5626 Project: Solr Issue Type: Wish Components: SolrCloud Reporter: Nathan Neulinger Right now, the collections API isn't self contained - to do server administration, you still have to externally use the zkCli.sh scripts to upload configs. It would be really nice to be able to issue a collections API call to upload a configuration and link that config. I realize that this provides a direct layer into Zk, but the benefit is that at that point - Zk could be completely hidden and behind the scenes - and only exposed to Solr itself. I would suggest that it take a file upload, examples: /solr/admin/collections action=UPCONFIG configName=whatevername configSolrHome=whateverpath configContent=fileUpload/multipart configContentType=typeOfUpload(mimetype?) /solr/admin/collections action=LINKCONFIG collection=collname configName=whatevername You could even extend the CREATE operation/api to take configContent and configContentType file upload, and dynamically create a configName. Then creating a collection becomes a one-shot API call with no outside dependencies. I would suggest that the zip/tar content be specified as rooted at the base of the config dir. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
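Until such an API exists, the usual workaround is to drive zkcli.sh from provisioning code. A sketch building the upconfig and linkconfig invocations (the ZK ensemble address and script path are placeholders; the `-cmd upconfig` / `-cmd linkconfig` forms are as documented for the 4.x cloud-scripts zkcli.sh):

```python
import subprocess  # used when actually invoking the commands below

ZKHOST = "zk1:2181,zk2:2181,zk3:2181"  # placeholder ensemble address

def upconfig_cmd(confdir, confname):
    """Command line to upload a config directory into ZooKeeper."""
    return ["./zkcli.sh", "-zkhost", ZKHOST, "-cmd", "upconfig",
            "-confdir", confdir, "-confname", confname]

def linkconfig_cmd(collection, confname):
    """Command line to link an uploaded config to a collection."""
    return ["./zkcli.sh", "-zkhost", ZKHOST, "-cmd", "linkconfig",
            "-collection", collection, "-confname", confname]

# e.g. subprocess.check_call(upconfig_cmd("./conf", "whatevername"))
#      subprocess.check_call(linkconfig_cmd("collname", "whatevername"))
print(" ".join(upconfig_cmd("./conf", "whatevername")))
```

Wrapping these two steps plus a Collections API CREATE call in one script approximates the one-shot provisioning flow the issue asks for, with ZK still visible to the caller.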
[jira] [Updated] (SOLR-5626) Would like to be able to 'upconfig' and 'linkconfig' via collections api
[ https://issues.apache.org/jira/browse/SOLR-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Neulinger updated SOLR-5626: --- Labels: api collections rest zkcli.sh zookeeper (was: ) > Would like to be able to 'upconfig' and 'linkconfig' via collections api > > > Key: SOLR-5626 > URL: https://issues.apache.org/jira/browse/SOLR-5626 > Project: Solr > Issue Type: Wish > Components: SolrCloud >Reporter: Nathan Neulinger > Labels: api, collections, rest, zkcli.sh, zookeeper > > Right now, the collections API isn't self contained - to do server > administration, you still have to externally use the zkCli.sh scripts to > upload configs. > It would be really nice to be able to issue a collections API call to upload > a configuration and link that config. I realize that this providing a direct > layer into Zk, but the benefit is that at that point - Zk could be completely > hidden and behind the scenes - and only exposed to Solr itself. > I would suggest that it take a file upload, examples: > /solr/admin/collections >action=UPCONFIG >configName=whatevername >configSolrHome=whateverpath >configContent=fileUpload/multipart >configContentType=typeOfUpload(mimetype?) > /solr/admin/collections >action=LINKCONFIG >collection=collname >configName=whatevername > You could even extend the CREATE operation/api to take configContent and > configContentType file upload, and dynamically create a configName. Then > creating a collection becomes a one-shot API call with no outside > dependencies. > I would suggest that the zip/tar content be specified as rooted at the based > of the config dir. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5626) Would like to be able to 'upconfig' and 'linkconfig' via collections api
[ https://issues.apache.org/jira/browse/SOLR-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869695#comment-13869695 ] Nathan Neulinger commented on SOLR-5626: This is somewhat related to the request for a 'LIST' operation, in terms of making the collections API self-sufficient.
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810192#comment-13810192 ] Nathan Neulinger commented on SOLR-5407: After some further investigation, it seems like this might be related to SOLR-5325, fixed in 4.5.1. We haven't upgraded yet, but have it scheduled. I also raised the zk tick to 5000 and increased the timeout to 40 seconds in case that helps. > Strange error condition with cloud replication not working quite right > -- > > Key: SOLR-5407 > URL: https://issues.apache.org/jira/browse/SOLR-5407 > Project: Solr > Issue Type: Bug >Affects Versions: 4.5 >Reporter: Nathan Neulinger > Labels: cloud, replication > > I have a cloud deployment of 4.5 on EC2. Architecture is 3 dedicated ZK > nodes, and a pair of Solr nodes. I'll apologize in advance that this error > report is not going to have a lot of detail; I'm really hoping that the > scenario/description will trigger some "likely" possible explanation. > The situation I got into was that the server had decided to fail over, so my > app servers were all talking to what should have been the primary for most of > the shards/collections, but actually was the replica. > Here's where it gets odd - no errors were returned to the client code for > any of the searches or document updates, and the current primary server was > definitely receiving all of the updates - even though they were being > submitted to the inactive/replica node. (Clients were talking to solr-p1, which > was not primary at the time, and writes were being passed through to solr-r1, > which was primary at the time.) > All sounds good so far, right? Except the replica server at the time, > through which the writes were passing, never got any of those content > updates. It had an old, unmodified copy of the index. > I restarted solr-p1 (the replica at the time) - no change in behavior. > Behavior did not change until I killed and restarted the current primary > (solr-r1) to force it to fail over. > At that point, everything was happy again and working properly. > Until this morning, when one of the developers provisioned a new collection, > which happened to put its primary on solr-r1. Again, clients were all pointing at > solr-p1. The developer reported that the documents were going into the index, > but were not visible on the replica server. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
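The tick/timeout changes mentioned in the comment above would typically land in two places. This is a sketch only - the values mirror the comment (tick 5000 ms, 40 s client timeout), and the exact file locations and startup command are assumptions about this deployment:

```
# zoo.cfg on each ZooKeeper node (default tickTime is 2000 ms):
tickTime=5000
# ZooKeeper caps negotiated session timeouts at maxSessionTimeout,
# which defaults to 20 * tickTime, so 5000 allows sessions up to 100 s.

# On each Solr node, raise the ZK client timeout to 40 s. The stock
# Solr 4.x solr.xml references ${zkClientTimeout:15000}, so a system
# property works:
java -DzkClientTimeout=40000 -jar start.jar
```

The client timeout must stay within ZooKeeper's min/max session bounds (2x to 20x tickTime), which is why both sides are raised together here.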
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809818#comment-13809818 ] Nathan Neulinger commented on SOLR-5407: The bigger concern than the initial failure is that the Solr deployment sort of acted like everything was up, but was only partially working right.
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809802#comment-13809802 ] Nathan Neulinger commented on SOLR-5407: This also looks an awful lot like what we saw: http://osdir.com/ml/solr-user.lucene.apache.org/2013-10/msg00673.html
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809800#comment-13809800 ] Nathan Neulinger commented on SOLR-5407: Found this - will investigate further: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccabcj+++zzocam0edgv-3xwpfvrpfskoyazjt1xcqm2myht+...@mail.gmail.com%3E - it talks about raising the ZooKeeper session timeout.
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809799#comment-13809799 ] Nathan Neulinger commented on SOLR-5407: Digging further, it looks like it all keys around some sort of communications problem with zookeeper - looks like it all started at the end of this log snippet below (reverse time order) when it's reporting that 'Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper'.
2013-10-29T16:25:50.344Z Going to wait for coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:50.329Z publishing core=myappqa-master_v8_shard1_replica1 state=down
2013-10-29T16:25:50.329Z Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:50.328Z Waited coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null for: 1 seconds.
2013-10-29T16:25:49.884Z A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:49.825Z Updating cloud state from ZooKeeper...
2013-10-29T16:25:49.825Z Update state numShards=1 message={ "operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:49.324Z Going to wait for coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:49.309Z Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:49.308Z publishing core=myappqa-master_v6_shard1_replica2 state=down
2013-10-29T16:25:49.308Z Waited coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:48.302Z A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:48.239Z Updating cloud state from ZooKeeper...
2013-10-29T16:25:48.239Z Update state numShards=1 message={ "operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:47.304Z Going to wait for coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:47.289Z Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:47.289Z publishing core=myappstaging-profile_v7_shard1_replica1 state=down
2013-10-29T16:25:47.287Z Waited coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:46.469Z A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:46.406Z Update state numShards=1 message={ "operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:45.925Z Updating cloud state from ZooKeeper...
2013-10-29T16:25:45.286Z Going to wait for coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:45.270Z publishing core=myappstaging-profile_v8_shard1_replica1 state=down
2013-10-29T16:25:45.270Z Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:45.269Z Waited coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:45.039Z A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:44.994Z makePath: /collections/myappproduction-production_v8/leaders/shard1
2013-10-29T16:25:44.994Z I am the new leader: http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/ shard1
2013-10-29T16:25:44.994Z http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/ has no replicas
2013-10-29T16:25:44.991Z Sync replicas to http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/
2013-10-29T16:25:44.991Z My last published State was Active, it's okay to be the leader.
2013-10-29T16:25:44.991Z Running the leader process for shard shard1
2013-10-29T16:25:44.991Z I may be the new leader - try and sync
2013-10-29T16:25:44.991Z Sync Success - now sync replicas to me
2013-10-29T16:25:44.991Z Checking if I should try and be the leader.
2013-10-29T16:25:44.940Z I am the new leader: http://10.136.6.24:8983/solr/myappstaging-feature-completion_v9_shard1_replica1/ shard1
2
[jira] [Commented] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak
[ https://issues.apache.org/jira/browse/SOLR-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809770#comment-13809770 ] Nathan Neulinger commented on SOLR-5405: Response: I haven't. However, it's notable that I've come to realize that I tend to just tune out information carried in color. There are some good resources around (such as http://www.mollietaylor.com/2012/10/color-blindness-and-palette-choice.html and http://colorschemedesigner.com) on how to choose colors/shades in a way that avoids problems for colorblind folks like me. But I actually like using color in combination with a secondary mechanism (like the X you mentioned), which I think works really well. Just FYI, my biggest problems are when you have certain color/shade combinations together. For example, dark green is hard to differentiate from brown and red. Light green is hard to distinguish from yellow. And medium green is hard to separate from orange. On Wed, Oct 30, 2013 at 4:33 PM, Nathan Neulinger wrote: Noticed anyplace else in the UI where colors are being used for information content that isn't otherwise represented? -- Nathan > Cloud graph view not usable by color-blind users - request small tweak > -- > > Key: SOLR-5405 > URL: https://issues.apache.org/jira/browse/SOLR-5405 > Project: Solr > Issue Type: Improvement > Components: web gui >Affects Versions: 4.5 >Reporter: Nathan Neulinger >Assignee: Stefan Matheis (steffkes) > Labels: accessibility > > Currently, the cloud view status is impossible to see easily on the graph > screen if you are color blind. (One of my coworkers.) > Would it be possible to put " (X)" after the IP of the node where X is > [LARDFG] for the states? -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right
[ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809767#comment-13809767 ] Nathan Neulinger commented on SOLR-5407: The only error we could find in the logs was this:
09:08:01 WARN PeerSync no frame of reference to tell if we've missed updates
09:25:49 WARN Overseer
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 WARN OverseerCollectionProcessor Overseer cannot talk to ZK
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 ERROR SolrDispatchFilter null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:26:37 WARN PeerSync no frame of reference to tell if we've missed updates
[jira] [Created] (SOLR-5407) Strange error condition with cloud replication not working quite right
Nathan Neulinger created SOLR-5407: -- Summary: Strange error condition with cloud replication not working quite right Key: SOLR-5407 URL: https://issues.apache.org/jira/browse/SOLR-5407 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger I have a cloud deployment of 4.5 on EC2. Architecture is 3 dedicated ZK nodes, and a pair of Solr nodes. I'll apologize in advance that this error report is not going to have a lot of detail; I'm really hoping that the scenario/description will trigger some "likely" possible explanation. The situation I got into was that the server had decided to fail over, so my app servers were all talking to what should have been the primary for most of the shards/collections, but actually was the replica. Here's where it gets odd - no errors were returned to the client code for any of the searches or document updates, and the current primary server was definitely receiving all of the updates - even though they were being submitted to the inactive/replica node. (Clients were talking to solr-p1, which was not primary at the time, and writes were being passed through to solr-r1, which was primary at the time.) All sounds good so far, right? Except the replica server at the time, through which the writes were passing, never got any of those content updates. It had an old, unmodified copy of the index. I restarted solr-p1 (the replica at the time) - no change in behavior. Behavior did not change until I killed and restarted the current primary (solr-r1) to force it to fail over. At that point, everything was happy again and working properly. Until this morning, when one of the developers provisioned a new collection, which happened to put its primary on solr-r1. Again, clients were all pointing at solr-p1. The developer reported that the documents were going into the index, but were not visible on the replica server.
[jira] [Commented] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak
[ https://issues.apache.org/jira/browse/SOLR-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809762#comment-13809762 ] Nathan Neulinger commented on SOLR-5405: I passed the inquiry along, and will be sure to submit any other easy tweaks to make it more accessible.
[jira] [Created] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak
Nathan Neulinger created SOLR-5405: -- Summary: Cloud graph view not usable by color-blind users - request small tweak Key: SOLR-5405 URL: https://issues.apache.org/jira/browse/SOLR-5405 Project: Solr Issue Type: Improvement Affects Versions: 4.5 Reporter: Nathan Neulinger Currently, the cloud view status is impossible to see easily on the graph screen if you are color blind. (One of my coworkers.) Would it be possible to put " (X)" after the IP of the node where X is [LARDFG] for the states? -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
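One plausible reading of the proposed "(X)" suffix - and it is only an assumption here, since the ticket never spells the letters out - is that [LARDFG] maps the graph-view replica states to single letters, something like:

```python
# Hypothetical mapping of cloud graph-view states to the single-letter
# suffixes proposed in the ticket. The state names and letter choices are
# an illustrative assumption, not Solr's actual implementation.
STATE_LETTERS = {
    "leader": "L",
    "active": "A",
    "recovering": "R",
    "down": "D",
    "recovery_failed": "F",
    "gone": "G",
}

def label(ip: str, state: str) -> str:
    """Render a node label with the proposed accessibility suffix."""
    return f"{ip} ({STATE_LETTERS.get(state, '?')})"

print(label("10.136.6.24", "active"))  # -> 10.136.6.24 (A)
```

The point of the request is exactly this kind of redundant encoding: the state stays visible even when the color alone is not distinguishable.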
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788110#comment-13788110 ] Nathan Neulinger commented on SOLR-5307: In my case, I have a dynamic set of collections, so I started everything up with collection1, which is essentially unused and only exists for the sake of bringing the cluster up and online. Based on that, if the other issue didn't do that bootstrap, I can certainly see the issues being the same. > Solr 4.5 collection api ignores collection.configName when used in cloud > > > Key: SOLR-5307 > URL: https://issues.apache.org/jira/browse/SOLR-5307 > Project: Solr > Issue Type: Bug >Affects Versions: 4.5 >Reporter: Nathan Neulinger > Labels: cloud, collection-api, zookeeper > > This worked properly in 4.4, but on 4.5, specifying collection.configName > when creating a collection doesn't work - it gets the default regardless of > what has been uploaded into zk. Explicitly linking config name to collection > ahead of time with zkcli.sh is a workaround I'm using for the moment, but > that did not appear to be necessary with 4.4 unless I was doing something > wrong and not realizing it.
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788109#comment-13788109 ] Nathan Neulinger commented on SOLR-5307: Isn't that the point of starting up the first node with the bootstrap zk config, as it says in the documentation?
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788102#comment-13788102 ] Nathan Neulinger commented on SOLR-5307: Sure sounds similar, though in my case the collection was created without error - it just got the default configuration (so it didn't have our custom schema/config).
[jira] [Created] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
Nathan Neulinger created SOLR-5307: -- Summary: Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
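For reference, the zkcli.sh workaround described above usually takes this shape with Solr 4.x's cloud-scripts/zkcli.sh; the ZooKeeper hosts, config path, and names below are placeholders for your environment:

```shell
# Upload a config set to ZooKeeper, then link it to a collection
# explicitly before (or instead of) relying on collection.configName.
./zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
    -cmd upconfig -confdir /path/to/conf -confname myconf

./zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
    -cmd linkconfig -collection mycollection -confname myconf
```

After the linkconfig step, creating the collection picks up the linked config regardless of whether the CREATE call's configName parameter is honored.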