[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352788#comment-15352788 ]
ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit adaabaf834964e1674236fca1d4a2801c6cad931 in lucene-solr's branch refs/heads/master from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=adaabaf ]

Trivial name spelling fix for SOLR-445

Merge branch 'patch-3' of https://github.com/arafalov/lucene-solr-1

This closes #43

> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Will Johnson
> Assignee: Hoss Man
> Fix For: 6.1, master (7.0)
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> This issue adds a new {{TolerantUpdateProcessorFactory}} making it possible
> to configure solr updates so that they are "tolerant" of individual errors in
> an update request...
> {code}
> <processor class="solr.TolerantUpdateProcessorFactory">
>   <int name="maxErrors">10</int>
> </processor>
> {code}
> When a chain with this processor is used, but maxErrors isn't exceeded,
> here's what the response looks like...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=-1' \
>     -H "Content-Type: application/json" \
>     --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n "}],
>     "maxErrors":-1,
>     "status":0,
>     "QTime":1}}
> {code}
> Note in the above example that:
> * maxErrors can be overridden on a per-request basis
> * an effective {{maxErrors==-1}} (either from config, or request param) means
> "unlimited" (under the covers it's using {{Integer.MAX_VALUE}})
>
> If/When maxErrors is reached for a request, then the _first_ exception that
> the processor caught is propagated back to the user, and metadata is set on
> that exception with all of the same details about all the tolerated errors.
> This next example is the same as the previous except that instead of
> {{maxErrors=-1}} the request param is now {{maxErrors=1}}...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=1' \
>     -H "Content-Type: application/json" \
>     --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n "}],
>     "maxErrors":1,
>     "status":400,
>     "QTime":1},
>   "error":{
>     "metadata":[
>       "org.apache.solr.common.ToleratedUpdateError--ADD:1","ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>       "org.apache.solr.common.ToleratedUpdateError--DELQ:malformed:[","org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n ",
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","java.lang.NumberFormatException"],
>     "msg":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>     "code":400}}
> {code}
> ...the added exception metadata ensures that even in client code like the
> various SolrJ SolrClient implementations, which throw a (client side)
> exception on non-200 responses, the end user can access info on all the
> tolerated errors that were ignored before the maxErrors threshold was reached.
> ----
> {panel:title=Original Jira Request}
> Has anyone run into the problem of handling bad documents / failures mid batch.
> Ie:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
> {panel}
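
For readers consuming these responses outside SolrJ, here is a minimal Python sketch of how the flat "metadata" list in the {{maxErrors=1}} response quoted above might be turned back into per-error records. It assumes only the shape shown in that sample: alternating key/value entries, with tolerated errors keyed as {{org.apache.solr.common.ToleratedUpdateError--TYPE:id}}. The function names are hypothetical and not part of Solr or SolrJ.

```python
import json

# Key prefix copied from the sample response in the issue description.
TOLERATED_PREFIX = "org.apache.solr.common.ToleratedUpdateError--"


def tolerated_errors_from_metadata(metadata):
    """Extract tolerated errors from a flat [key, value, key, value, ...] list."""
    errors = []
    # Walk the list two items at a time: even index = key, odd index = value.
    for key, value in zip(metadata[0::2], metadata[1::2]):
        if not key.startswith(TOLERATED_PREFIX):
            continue  # skips entries such as "error-class" and "root-error-class"
        # The remainder looks like "TYPE:id"; the id may itself contain ':',
        # so split on the first ':' only.
        err_type, _, err_id = key[len(TOLERATED_PREFIX):].partition(":")
        errors.append({"type": err_type, "id": err_id, "message": value})
    return errors


def tolerated_errors(response_text):
    """Parse a JSON update response body and return its tolerated errors.

    Assumption: the errors live under error.metadata, as in the sample;
    a response without that section yields an empty list.
    """
    body = json.loads(response_text)
    metadata = body.get("error", {}).get("metadata", [])
    return tolerated_errors_from_metadata(metadata)
```

Splitting on the first ':' only keeps a delete-by-query id such as {{malformed:[}} intact, which a plain split on every ':' would break apart.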