At least parts of Solr use semi-custom JSON parsing that allows a map key to be repeated, so either this particular feature didn't use that parsing technique, didn't have the logic to reject the problem, or didn't process it properly. So I think this is SOME kind of issue on the Solr side, even if the fix is only better error reporting.

-- Jack Krupansky

-----Original Message-----
From: Shalin Shekhar Mangar
Sent: Thursday, July 17, 2014 12:40 AM
To: solr-user@lucene.apache.org
Subject: Re: problem with replication/solrcloud - getting 'missing required field' during update intermittently (SOLR-6251)

Phew, thanks for tracking it down.


On Thu, Jul 17, 2014 at 7:50 AM, Nathan Neulinger <nn...@neulinger.org>
wrote:

FYI. We finally tracked down the problem.... at least 99.9% sure at this
point, and it was staring me in the face the whole time - just never
noticed:

[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add":
"preet"},"channel": {"add": "adam"}}]

Look at the JSON... It's trying to add two "channel" array elements...
Should have been:

[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add":
"preet"}},
 {"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "adam"}}]
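
A minimal sketch of how a client could build the corrected payload, with one update document per value as in the JSON above. The helper name build_channel_updates is hypothetical and not part of any Solr client library; it only constructs the request body.

```python
import json


def build_channel_updates(doc_id, channels):
    """Return one atomic-update document per channel value.

    Hypothetical helper: emitting a separate document per value
    avoids repeating the "channel" key inside a single JSON object.
    """
    return [{"id": doc_id, "channel": {"add": name}} for name in channels]


payload = build_channel_updates(
    "4b2c4d09-31e2-4fe2-b767-3868efbdcda1", ["preet", "adam"]
)
body = json.dumps(payload)  # JSON text to send as the update request body
```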

I half wonder how it chose to interpret that particular chunk of json, but
either way, I think the origin of our issue is resolved.
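
For comparison, here is how one common parser treats the repeated key. This is Python's standard json module, not Solr's parser, so it says nothing definite about what Solr did; it only shows that a repeated key can be accepted silently, with the last value winning.

```python
import json

# The problematic request body from above, with "channel" repeated.
bad = ('[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",'
       '"channel": {"add": "preet"},"channel": {"add": "adam"}}]')

docs = json.loads(bad)      # no error is raised for the repeated key
print(docs[0]["channel"])   # {'add': 'adam'} -- the "preet" update is dropped
```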


From what I'm reading on JSON - this isn't valid syntax at all. I'm
guessing that SOLR doesn't actually validate the JSON, and its parser is
just creating something weird in that situation like a new request for a
whole new document.
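
One way to catch this on the client side, before the request reaches Solr, is to reject repeated keys while parsing. RFC 8259 only says that names within an object SHOULD be unique and that behavior is unpredictable when they are not, so a parser may well accept such input. The sketch below uses the object_pairs_hook parameter of Python's json.loads; the names reject_duplicate_keys and strict_loads are made up for illustration.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a key repeats within one JSON object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key in JSON object: %r" % key)
        seen[key] = value
    return seen


def strict_loads(text):
    """Parse JSON, refusing objects that repeat a key (hypothetical helper)."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


bad = ('[{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1",'
       '"channel": {"add": "preet"},"channel": {"add": "adam"}}]')
try:
    strict_loads(bad)
except ValueError as err:
    print(err)  # duplicate key in JSON object: 'channel'
```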

-- Nathan



On 07/15/2014 07:19 PM, Nathan Neulinger wrote:

Issue was closed in Jira requesting it be discussed here first. Looking
for any diagnostic assistance on this issue with
4.8.0 since it is intermittent and occurs without warning.

Setup is two nodes, with external zk ensemble. Nodes are accessed
round-robin on EC2 behind an ELB.

Schema has:

<schema name="hive" version="1.5">
...
    <field name="timestamp" type="long" indexed="false" stored="true"
           required="true" multiValued="false" omitNorms="true" />
...


Most of the updates are working without issue, but randomly we'll get the
above failure, even though searches before and
after the update clearly indicate that the document had the timestamp
field in it. The error occurs when the second node
does its distrib operation against the first node.

Diagnostic details are all in the Jira issue. Can provide more as needed,
but would appreciate any suggestions on what
to try or to help diagnose this other than just trying to throw thousands
of requests at it in round-robin between the
two instances to see if it's possible to reproduce the issue.

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       nn...@neulinger.org
Neulinger Consulting                   (573) 612-1412


--
Regards,
Shalin Shekhar Mangar.
