[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15352798#comment-15352798 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit c8f9973a106c57075601d963f13b5e0997f14f7d in lucene-solr's branch refs/heads/branch_6x from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c8f9973 ]

Trivial name spelling fix for SOLR-445.
Cherry-picked 8c47d20

> Update Handlers abort with bad documents
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Will Johnson
> Assignee: Hoss Man
> Fix For: 6.1, master (7.0)
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> This issue adds a new {{TolerantUpdateProcessorFactory}} making it possible
> to configure solr updates so that they are "tolerant" of individual errors in
> an update request...
> {code}
> <processor class="solr.TolerantUpdateProcessorFactory">
>   <int name="maxErrors">10</int>
> </processor>
> {code}
> When a chain with this processor is used, but maxErrors isn't exceeded,
> here's what the response looks like...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=-1' -H "Content-Type: application/json" --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n"}],
>     "maxErrors":-1,
>     "status":0,
>     "QTime":1}}
> {code}
> Note in the above example that:
> * maxErrors can be overridden on a per-request basis
> * an effective {{maxErrors==-1}} (either from config, or request param) means
> "unlimited" (under the covers it's using {{Integer.MAX_VALUE}})
> If/When maxErrors is reached for a request, then the _first_ exception that
> the processor caught is propagated back to the user, and metadata is set on
> that exception with all of the same details about all the tolerated errors.
> This next example is the same as the previous except that instead of
> {{maxErrors=-1}} the request param is now {{maxErrors=1}}...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=1' -H "Content-Type: application/json" --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n"}],
>     "maxErrors":1,
>     "status":400,
>     "QTime":1},
>   "error":{
>     "metadata":[
>       "org.apache.solr.common.ToleratedUpdateError--ADD:1","ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>       "org.apache.solr.common.ToleratedUpdateError--DELQ:malformed:[","org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n",
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","java.lang.NumberFormatException"],
>     "msg":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>     "code":400}}
> {code}
> ...the added exception metadata ensures that even in client code like the
> various SolrJ SolrClient implementations, which throw a (client side)
> exception on non-200 responses, the end user can access info on all the
> tolerated errors that were ignored before the maxErrors threshold was reached.
>
> {panel:title=Original Jira Request}
> Has anyone run into the problem of handling bad documents / failures mid
> batch. Ie:
>
>
> 1
>
>
> 2
> I_AM_A_BAD_DATE
>
>
> 3
>
>
> Right now solr adds the first doc and then aborts. It would seem like it
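
The "metadata" array in the maxErrors=1 response quoted above is a flat list of alternating keys and values, and each tolerated error appears to be keyed as the prefix "org.apache.solr.common.ToleratedUpdateError--" followed by "TYPE:id". As a rough client-side sketch (not SolrJ code and not part of the patch), the pairs could be unpacked like this. The function name tolerated_errors and the shortened sample messages are made up for illustration, and the key layout is assumed only from the example response:

```python
# Key prefix seen in the example response; assumed stable only for this sketch.
PREFIX = "org.apache.solr.common.ToleratedUpdateError--"


def tolerated_errors(metadata):
    """Convert a flat [key, value, key, value, ...] list into error dicts.

    Hypothetical helper: only keys starting with PREFIX are treated as
    tolerated errors; other pairs such as "error-class" are skipped.
    """
    errors = []
    # Even positions hold keys, odd positions hold the matching values.
    for key, message in zip(metadata[0::2], metadata[1::2]):
        if not key.startswith(PREFIX):
            continue
        # The remainder looks like "ADD:1" or "DELQ:malformed:[".
        # partition() splits on the first ":" only, so ids may contain colons.
        kind, _, ident = key[len(PREFIX):].partition(":")
        errors.append({"type": kind, "id": ident, "message": message})
    return errors


# Sample data shaped like the "metadata" array above (messages shortened).
sample_metadata = [
    PREFIX + "ADD:1", "ERROR: [doc=1] Error adding field 'foo_i'='bogus'",
    PREFIX + "DELQ:malformed:[", "Cannot parse 'malformed:['",
    "error-class", "org.apache.solr.common.SolrException",
    "root-error-class", "java.lang.NumberFormatException",
]

for err in tolerated_errors(sample_metadata):
    print(err["type"], err["id"])
```

Splitting on the first colon only matters for the DELQ entry, whose id is the delete query itself and contains a colon of its own.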
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15352790#comment-15352790 ]

ASF GitHub Bot commented on SOLR-445:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/lucene-solr/pull/43

> Right now solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15352788#comment-15352788 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit adaabaf834964e1674236fca1d4a2801c6cad931 in lucene-solr's branch refs/heads/master from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=adaabaf ]

Trivial name spelling fix for SOLR-445

Merge branch 'patch-3' of https://github.com/arafalov/lucene-solr-1

This closes #43
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327357#comment-15327357 ]

ASF GitHub Bot commented on SOLR-445:
-------------------------------------

GitHub user arafalov opened a pull request:

    https://github.com/apache/lucene-solr/pull/43

    Trivial name spelling fix for SOLR-445

    ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arafalov/lucene-solr-1 patch-3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/43.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #43

commit 6742355f93f0d2d03600fe408b542507ee89bf54
Author: Alexandre Rafalovitch
Date: 2016-06-13T13:19:25Z

    Trivial Spelling fix

    ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory

commit ebffa9aa2aebd689db53ba363d5022b893c7eeb0
Author: Alexandre Rafalovitch
Date: 2016-06-13T13:22:49Z

    Trivial Spelling fix

    ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212418#comment-15212418 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit b8c0ff66f958e5e199874059b0427ea267778c3a in lucene-solr's branch refs/heads/branch_6x from [~hossman_luc...@fucit.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b8c0ff6 ]

SOLR-445: Merge remote-tracking branch 'refs/remotes/origin/branch_6x' into branch_6x

(picking up mid backport conflicts)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212417#comment-15212417 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit 5b6eacb80bca5815059cd50a1646fa4ecb146e43 in lucene-solr's branch refs/heads/branch_6x from [~hossman_luc...@fucit.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b6eacb ]

SOLR-445: new ToleranteUpdateProcessorFactory to support skipping update commands that cause failures when sending multiple updates in a single request.

SOLR-8890: New static method in DistributedUpdateProcessorFactory to allow UpdateProcessorFactories to indicate request params that should be forwarded when DUP distributes updates.

This commit is a squashed merge from the jira/SOLR-445 branch (as of b08c284b26b1779d03693a45e219db89839461d0)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212415#comment-15212415 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit 5b6eacb80bca5815059cd50a1646fa4ecb146e43 in lucene-solr's branch refs/heads/branch_6x from [~hossman_luc...@fucit.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b6eacb ]

SOLR-445: new ToleranteUpdateProcessorFactory to support skipping update commands that cause failures when sending multiple updates in a single request.

SOLR-8890: New static method in DistributedUpdateProcessorFactory to allow UpdateProcessorFactories to indicate request params that should be forwarded when DUP distributes updates.

This commit is a squashed merge from the jira/SOLR-445 branch (as of b08c284b26b1779d03693a45e219db89839461d0)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212181#comment-15212181 ] ASF subversion and git services commented on SOLR-445: -- Commit f051f56be96b12f1f3e35978ca4c840ae06a801f in lucene-solr's branch refs/heads/master from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f051f56 ] SOLR-445: new ToleranteUpdateProcessorFactory to support skipping update commands that cause failures when sending multiple updates in a single request. SOLR-8890: New static method in DistributedUpdateProcessorFactory to allow UpdateProcessorFactories to indicate request params that should be forwarded when DUP distributes updates. This commit is a squashed merge from the jira/SOLR-445 branch (as of b08c284b26b1779d03693a45e219db89839461d0)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210827#comment-15210827 ] Timothy Potter commented on SOLR-445: - LGTM +1 Nice test coverage of all this! This will be very useful for streaming applications (such as from Spark and Storm) where re-trying individual docs after an error is less than ideal. Now we'll be able to pin-point exactly which docs had issues! I'd prefer this to be baked into the default chain but can understand the rationale for leaving it out for now too. So long as we put up an example of how to enable it using the Config API in the ref guide.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210784#comment-15210784 ] Hoss Man commented on SOLR-445: --- I'm still beasting the tests a bit, but i think this is pretty solid and ready for master/branch_6x > Update Handlers abort with bad documents > > > Key: SOLR-445 > URL: https://issues.apache.org/jira/browse/SOLR-445 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.3 >Reporter: Will Johnson >Assignee: Hoss Man > Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, > SOLR-445-alternative.patch, SOLR-445-alternative.patch, > SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml > > > Has anyone run into the problem of handling bad documents / failures mid > batch. Ie: > > > 1 > > > 2 > I_AM_A_BAD_DATE > > > 3 > > > Right now solr adds the first doc and then aborts. It would seem like it > should either fail the entire batch or log a message/return a code and then > continue on to add doc 3. Option 1 would seem to be much harder to > accomplish and possibly require more memory while Option 2 would require more > information to come back from the API. I'm about to dig into this but I > thought I'd ask to see if anyone had any suggestions, thoughts or comments. > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
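The "Option 2" behaviour asked for in the original request (record the failure, keep going with doc 3), combined with the maxErrors semantics from the issue description, can be sketched roughly as follows. This is a plain Python illustration, not the TolerantUpdateProcessor implementation: tolerant_update, TooManyErrors, add_doc and the "date" field name are all hypothetical, and it assumes the request fails once the number of errors exceeds maxErrors, with -1 meaning unlimited.

```python
from datetime import datetime

UNLIMITED = 2**31 - 1  # mirrors the Integer.MAX_VALUE note in the description


class TooManyErrors(Exception):
    """Carries the first failure plus every error tolerated so far."""

    def __init__(self, first, tolerated):
        super().__init__(str(first))
        self.first = first
        self.tolerated = tolerated


def tolerant_update(commands, apply, max_errors=10):
    """Apply each command, tolerating failures until max_errors is exceeded."""
    limit = UNLIMITED if max_errors == -1 else max_errors
    tolerated = []
    for cmd in commands:
        try:
            apply(cmd)
        except Exception as exc:  # deliberately broad for this sketch
            tolerated.append((cmd, exc))
            if len(tolerated) > limit:
                # Propagate the *first* failure, with all details attached.
                raise TooManyErrors(tolerated[0][1], tolerated) from exc
    return tolerated


# Toy stand-in for an index; "date" is an assumed field name.
index = {}


def add_doc(doc):
    if "date" in doc:
        datetime.strptime(doc["date"], "%Y-%m-%d")  # ValueError on bad input
    index[doc["id"]] = doc


batch = [{"id": "1"}, {"id": "2", "date": "I_AM_A_BAD_DATE"}, {"id": "3"}]
errors = tolerant_update(batch, add_doc, max_errors=10)
print(sorted(index), [cmd["id"] for cmd, _ in errors])
```

With the three-document batch from the request, docs 1 and 3 end up in the toy index and doc 2 is reported back as a tolerated error instead of aborting the whole batch.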
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210781#comment-15210781 ] ASF subversion and git services commented on SOLR-445: -- Commit b08c284b26b1779d03693a45e219db89839461d0 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b08c284 ] SOLR-445: fix logger declaration to satisfy precommit
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210779#comment-15210779 ] ASF subversion and git services commented on SOLR-445: -- Commit 39884c0b0c02b4090640d6268a45a1cf5f54f3e0 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=39884c0 ] SOLR-445: removing questionable isLeader check; beasting the tests w/o this code didn't demonstrate any problems
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210780#comment-15210780 ] ASF subversion and git services commented on SOLR-445: -- Commit 1d8cdd27993a46ae17c4ac308504513a33f01a15 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1d8cdd2 ] SOLR-445: remove test - we have more complete coverage in TestTolerantUpdateProcessorCloud which uses the more robust SolrCloudTestCase model
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209462#comment-15209462 ] ASF subversion and git services commented on SOLR-445: -- Commit 956d9a592a0a6e9c9d7c8244a4289f4cbf5d5012 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=956d9a5 ] SOLR-445: more testing of DBQ mixed with failures (trying to staticly recreate a random failure i haven't fully figured out yet)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209465#comment-15209465 ] ASF subversion and git services commented on SOLR-445: -- Commit a4686553712a0d01dc2d6853038c4cca2caee63f in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a468655 ] SOLR-445: randomized testing of the 'doc missing unique key' code path
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209467#comment-15209467 ] ASF subversion and git services commented on SOLR-445: -- Commit da3ea40e80189c7c2bbd8114a99c72a64262786b in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=da3ea40 ] SOLR-8890: generalized whitelist of param names DUP will use when forwarding requests, usage in SOLR-445
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209463#comment-15209463 ] ASF subversion and git services commented on SOLR-445: -- Commit 2622eac2915ee210cfffd1969ef5dd8e2030e5cf in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2622eac ] SOLR-445: harden checks in random test; add isoluated cloud test demonstrating bug random test found; add nocommit hack to DUP to work around test failure for now (SOLR-8890 to fix a better way)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209461#comment-15209461 ] ASF subversion and git services commented on SOLR-445: -- Commit ae22181193dcb24707f7255f0132a2a0a85bf300 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ae22181 ] SOLR-445: fix silly test bug
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206749#comment-15206749 ]

Hoss Man commented on SOLR-445:
---

bq. What is the impact of many docs failing due to missing ID? Is there a test for that? I couldn't find one, but the diff is pretty big, I may have missed stuff.

good question -- there were checks of this in TolerantUpdateProcessorTest (from the early days of this patch) but i added some to TestTolerantUpdateProcessorCloud which uncovered a bug (now fixed) when checking isLeader -- see: cc2cd23ca2537324dc7e4afe6a29605bbf9f1cb8

bq. Don't know the answer to the "isLeader" question. I'd say the request would fail if leader changes in the middle of a request, but I'm not sure.

Hmm... can you explain more what you think/expect could go wrong with the isLeader code removed that wouldn't go wrong with the code as it is today?

I mean ... theoretically, even with the isLeader check as we have it right now, the leader could change between the time we do the isLeader check and the call to super.processAdd (where DUP will do its own isLeader check) ... or it could change (again) between the time super.processAdd/DUP.processAdd throws an exception and the time we make a decision whether to only track it or track and immediately re-throw.

I'm just not sure if that added code is really gaining us anything useful -- but if someone can help me understand (or better still: demonstrate with a test) a concrete situation where the current code does the correct thing, but the code with the isLeader check removed is broken, then i'll be convinced.

Where things currently stand:
* The only remaining nocommits on the branch are questions about deleting the isLeader code, and questions about deleting DistribTolerantUpdateProcessorTest since we have other more robust cloud tests now.
* Even with the "retry after giving searchers time to reopen" logic in TestTolerantUpdateProcessorRandomCloud, i'm seeing a failure that reproduces consistently for me...
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestTolerantUpdateProcessorRandomCloud -Dtests.method=testRandomUpdates -Dtests.seed=ECFD2B9118A542E7 -Dtests.slow=true -Dtests.locale=bg -Dtests.timezone=Asia/Taipei -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] FAILURE 6.00s | TestTolerantUpdateProcessorRandomCloud.testRandomUpdates <<<
[junit4]> Throwable #1: java.lang.AssertionError: cloud client doc count doesn't match bitself cardinality expected:<22> but was:<23>
{noformat}
...so i'm currently working to improve the logging and trace through the test to understand that.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206743#comment-15206743 ] ASF subversion and git services commented on SOLR-445: -- Commit 5d93384e724b6f611270e212a4f9bd5b00c38e85 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5d93384 ] SOLR-445: fix exception msg when CloudSolrClient does async updates that (cumulatively) exceed maxErrors I initially thought it would make sense to refactor DistributedUpdatesAsyncException into solr-common and re-use it here, but when i started down that path i realized it didn't make any sense since there aren't actual exceptions to wrap client side.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206741#comment-15206741 ] ASF subversion and git services commented on SOLR-445: -- Commit fe54da0b58ed18a38f3dd436dd3f30fbee9acbbf in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=fe54da0 ] SOLR-445: remove nocommits related to OOM trapping since SOLR-8539 has concluded that this isn't a thing the java code actually needs to be defensive of
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206746#comment-15206746 ] ASF subversion and git services commented on SOLR-445: -- Commit cc2cd23ca2537324dc7e4afe6a29605bbf9f1cb8 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cc2cd23 ] SOLR-445: cloud test & bug fix for docs missing their uniqueKey field
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205787#comment-15205787 ] Tomás Fernández Löbbe commented on SOLR-445: Looks really good to me. What is the impact of many docs failing due to missing ID? Is there a test for that? I couldn't find one, but the diff is pretty big, I may have missed stuff. Don't know the answer to the "isLeader" question. I'd say the request would fail if leader changes in the middle of a request, but I'm not sure.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205503#comment-15205503 ] Anshum Gupta commented on SOLR-445: --- Thanks Hoss. I didn't look at the recent commits but from whatever I reviewed, this looks good. A bunch of nocommits but good stuff overall. I'll try and pitch in.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203598#comment-15203598 ] ASF subversion and git services commented on SOLR-445: -- Commit 6ec8c635bf5853dfb229f89cb2818749c1cfe8ce in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6ec8c63 ] SOLR-445: cleanup some simple nocommits
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203599#comment-15203599 ] ASF subversion and git services commented on SOLR-445: -- Commit 21c0fe690dc4e968e484ee906632a50bf0273786 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=21c0fe6 ] SOLR-445: hardent the ToleratedUpdateError API to hide implementation details
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203594#comment-15203594 ] ASF subversion and git services commented on SOLR-445: -- Commit aeda8dc4ae881c4ec405d70dcbf1d0b2c30871b7 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=aeda8dc ] SOLR-445: fix test bugs, and put in a stupid work around for SOLR-8862
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203592#comment-15203592 ] ASF subversion and git services commented on SOLR-445: -- Commit 8cc0a38453b389bdb031d78ad638b76dfa27f2d5 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8cc0a38 ] SOLR-445: Merge branch 'master' into jira/SOLR-445
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203596#comment-15203596 ] ASF subversion and git services commented on SOLR-445: -- Commit 1aa1ba3b3af69cad65b7a411ca88e120a418a598 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1aa1ba3 ] SOLR-445: harden & add logging to test also rename since chaos monkey isn't going to be involved (due to SOLR-8872)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203593#comment-15203593 ] ASF subversion and git services commented on SOLR-445: -- Commit 8cc0a38453b389bdb031d78ad638b76dfa27f2d5 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8cc0a38 ] SOLR-445: Merge branch 'master' into jira/SOLR-445
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198396#comment-15198396 ] ASF subversion and git services commented on SOLR-445: -- Commit a0d48f873c21ca0ab5ba02748c1659a983aad886 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a0d48f8 ] SOLR-445: start of a new randomized/chaosmonkey test, currently blocked by SOLR-8862 (no monkey yet)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196063#comment-15196063 ] Hoss Man commented on SOLR-445: --- {{git diff master...jira/SOLR-445}} should be what you are looking for. (note: three dots) if you want to review the list of individual commits, that's {{git log master..jira/SOLR-445}} (note: two dots) (using two dots with diff, or three dots with log gives you totally different and exciting and confusing behavior)
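For anyone who wants to see the two-dot/three-dot difference in isolation, here is a minimal sketch in a throwaway repository. The branch name topic, the file names, and the commit messages are all made up for illustration (topic stands in for jira/SOLR-445); only the range syntax for git diff and git log comes from the comment above.

```shell
#!/bin/sh
# Hypothetical scratch repo: "topic" plays the role of jira/SOLR-445.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "dev@example.com"
git config user.name "Example Dev"
git config commit.gpgsign false

echo base > shared.txt
git add shared.txt
git commit -q -m "base commit"

# Work that lives only on the feature branch.
git checkout -q -b topic
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

# Meanwhile master moves on, so topic is no longer up to date with it.
git checkout -q master
echo unrelated > unrelated.txt
git add unrelated.txt
git commit -q -m "unrelated master work"

# diff with three dots: what topic changed since it forked from master.
three_dot_diff=$(git diff --name-only master...topic)
# diff with two dots: plain comparison of the two tips, master-only changes included.
two_dot_diff=$(git diff --name-only master..topic)
# log with two dots: commits reachable from topic but not from master.
two_dot_log=$(git log --format=%s master..topic)
# log with three dots: commits on either side that the other side lacks.
three_dot_log=$(git log --format=%s master...topic)

echo "diff master...topic -> $three_dot_diff"
echo "diff master..topic  -> $two_dot_diff"
echo "log  master..topic  -> $two_dot_log"
echo "log  master...topic -> $three_dot_log"
```

In this sketch the three-dot diff lists only feature.txt, while the two-dot diff also lists unrelated.txt because it compares the two branch tips directly. That matches the "ton of unrelated stuff" seen when the branch isn't up to date with master.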
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196019#comment-15196019 ] Anshum Gupta commented on SOLR-445: --- [~hossman] what's the best way to look at the diff here? I tried the following but this gives a ton of unrelated stuff, which I assume is because this branch isn't up to date with the current master {code} git diff master {code} I also tried this after [~elyograg] suggested it on irc, but it's the same result: {code} # after checking out and doing a pull for both, master and jira/SOLR-445 git diff refs/heads/master refs/heads/jira/SOLR-445 {code}
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195859#comment-15195859 ] ASF subversion and git services commented on SOLR-445: -- Commit 116ffaec6f6680c7312cb87680f8463df862f1f0 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=116ffae ] SOLR-445: test failures from adds mixed with deletes
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194699#comment-15194699 ] David Smiley commented on SOLR-445: --- The usage you explained looks nice Hoss! This will be very useful. I assume there are SolrJ hooks here. I'll leave the internal review to others.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194368#comment-15194368 ] Anshum Gupta commented on SOLR-445: --- Thanks Hoss! I'll take a look at this tonight or tomorrow morning.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194219#comment-15194219 ] Timothy Potter commented on SOLR-445: - digging into this now, thanks [~hossman]
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193606#comment-15193606 ] Hoss Man commented on SOLR-445: ---

(meant to post last friday but was blocked by the jira outage)

Ok ... i think things are looking pretty good on the jira/SOLR-445 branch -- good enough that I'd really like some help reviewing the code & sanity checking the API (and internals for anyone who is up for it)...

For folks who haven't been following closely, here's what the configuration looks like (from the javadocs)...

{code}
<processor class="solr.TolerantUpdateProcessorFactory">
  <int name="maxErrors">10</int>
</processor>
{code}

When a chain with this processor is used, but maxErrors isn't exceeded, here's what the response looks like...

{code}
$ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=-1' -H "Content-Type: application/json" --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
{
  "responseHeader":{
    "errors":[{
        "type":"ADD",
        "id":"1",
        "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
      {
        "type":"DELQ",
        "id":"malformed:[",
        "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n"}],
    "maxErrors":-1,
    "status":0,
    "QTime":1}}
{code}

Note in the above example that:
* maxErrors can be overridden on a per-request basis
* an effective {{maxErrors==-1}} (either from config, or request param) means "unlimited" (under the covers it's using {{Integer.MAX_VALUE}})

If/When maxErrors is reached for a request, then the _first_ exception that the processor caught is propagated back to the user, and metadata is set on that exception with all of the same details about all the tolerated errors.

This next example is the same as the previous except that instead of {{maxErrors=-1}} the request param is now {{maxErrors=1}}...
{code}
$ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=1' -H "Content-Type: application/json" --data-binary '{"add" : { "doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
{
  "responseHeader":{
    "errors":[{
        "type":"ADD",
        "id":"1",
        "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
      {
        "type":"DELQ",
        "id":"malformed:[",
        "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n"}],
    "maxErrors":1,
    "status":400,
    "QTime":1},
  "error":{
    "metadata":[
      "org.apache.solr.common.ToleratedUpdateError--ADD:1","ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
      "org.apache.solr.common.ToleratedUpdateError--DELQ:malformed:[","org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"\" at line 1, column 11.\nWas expecting one of:\n ...\n ...\n",
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","java.lang.NumberFormatException"],
    "msg":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
    "code":400}}
{code}

...the added exception metadata ensures that even in client code like the various SolrJ SolrClient implementations, which throw a (client side) exception on non-200 responses, the end user can access info on all the tolerated errors that were ignored before the maxErrors threshold was reached.

CloudSolrClient in particular -- which already has logic to split {{UpdateRequests}}; route individual commands to the appropriate leaders; and merge the responses -- has been updated to handle merging these responses as well.
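For readers skimming the thread, the maxErrors behavior described above can be sketched roughly as follows. This is not the code on the jira/SOLR-445 branch: the class {{TolerantBatchSketch}}, the exception {{TooManyErrorsException}} and the use of {{Integer.parseInt}} as the "update" are all made up for illustration, and the sketch assumes the request fails once the number of errors exceeds maxErrors (which matches the maxErrors=1 example with two errors).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; not the actual TolerantUpdateProcessor code. */
public class TolerantBatchSketch {

  /** Carries the first failure as the cause plus every error recorded so far. */
  public static class TooManyErrorsException extends RuntimeException {
    public final List<String> tolerated;

    public TooManyErrorsException(Exception first, List<String> tolerated) {
      super(first.getMessage(), first);
      this.tolerated = tolerated;
    }
  }

  /** An effective maxErrors of -1 means "unlimited", modeled as Integer.MAX_VALUE. */
  public static int effectiveMaxErrors(int configured) {
    return configured == -1 ? Integer.MAX_VALUE : configured;
  }

  /**
   * Stand-in for processing a batch: each value must parse as an int.
   * Returns the messages of the failures that were tolerated.
   */
  public static List<String> process(List<String> values, int maxErrors) {
    int limit = effectiveMaxErrors(maxErrors);
    List<String> errors = new ArrayList<>();
    Exception first = null;
    for (String value : values) {
      try {
        Integer.parseInt(value);
      } catch (NumberFormatException e) {
        if (first == null) {
          first = e; // remember the first failure; it is the one propagated
        }
        errors.add(e.getMessage());
        if (errors.size() > limit) { // assumption: fail once the limit is exceeded
          throw new TooManyErrorsException(first, errors);
        }
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    List<String> batch = Arrays.asList("1", "bogus", "3", "also-bogus");
    System.out.println(process(batch, -1).size() + " errors tolerated");
    try {
      process(batch, 1);
    } catch (TooManyErrorsException e) {
      System.out.println("request failed: " + e.getMessage()
          + " (" + e.tolerated.size() + " errors recorded)");
    }
  }
}
```

The point of keeping the full error list on the exception is the same one made above: a client that only sees the thrown exception can still find out about everything that was tolerated before the request failed.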
(The {{ToleratedUpdateError}} class for modeling these types of errors has been added to solr-common, and has static utilities that client code can use to parse the data out of the responseHeader or out of any client side SolrException metadata)

There are still a bunch of {{nocommit}} comments, but they are almost all related to either:
* adding tests
* adding docs
* refactoring / hardening some internal APIs
* removing suspected unnecessary "isLeader" code (once tests are final)

I'll keep working on those, but I'd appreciate feedback from folks on how things currently stand. Even if you don't understand/care about the internals, thoughts on the user facing API would be appreciated.
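To make the metadata format from the maxErrors=1 example concrete, here is a small standalone sketch that pulls the type and id back out of keys such as {{org.apache.solr.common.ToleratedUpdateError--ADD:1}}. It deliberately does not use the real {{ToleratedUpdateError}} utilities, whose signatures are not shown in this thread; the class {{ToleratedErrorSketch}} and its methods are hypothetical, and the key layout (prefix, then type, then the first colon, then the id) is inferred from the example response only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in; the real helper is ToleratedUpdateError in solr-common. */
public class ToleratedErrorSketch {

  /** Key prefix as it appears in the example response's "metadata" list. */
  public static final String KEY_PREFIX = "org.apache.solr.common.ToleratedUpdateError--";

  public final String type;
  public final String id;
  public final String message;

  public ToleratedErrorSketch(String type, String id, String message) {
    this.type = type;
    this.id = id;
    this.message = message;
  }

  /** Returns null when the key is not a tolerated-error key (e.g. "error-class"). */
  public static ToleratedErrorSketch parse(String key, String value) {
    if (!key.startsWith(KEY_PREFIX)) {
      return null;
    }
    String rest = key.substring(KEY_PREFIX.length());
    int colon = rest.indexOf(':'); // split on the FIRST colon; ids may contain more
    if (colon < 0) {
      return null;
    }
    return new ToleratedErrorSketch(rest.substring(0, colon), rest.substring(colon + 1), value);
  }

  /** The metadata arrives as a flat list of alternating keys and values. */
  public static List<ToleratedErrorSketch> parseAll(List<String> flatMetadata) {
    List<ToleratedErrorSketch> out = new ArrayList<>();
    for (int i = 0; i + 1 < flatMetadata.size(); i += 2) {
      ToleratedErrorSketch err = parse(flatMetadata.get(i), flatMetadata.get(i + 1));
      if (err != null) {
        out.add(err);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> metadata = Arrays.asList(
        KEY_PREFIX + "ADD:1", "ERROR: [doc=1] Error adding field 'foo_i'='bogus'",
        KEY_PREFIX + "DELQ:malformed:[", "org.apache.solr.search.SyntaxError: Cannot parse 'malformed:['",
        "error-class", "org.apache.solr.common.SolrException",
        "root-error-class", "java.lang.NumberFormatException");
    for (ToleratedErrorSketch err : parseAll(metadata)) {
      System.out.println(err.type + " / " + err.id + " / " + err.message);
    }
  }
}
```

Splitting on the first colon matters for the DELQ entry, whose id ({{malformed:[}}) itself contains a colon.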
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174038#comment-15174038 ] ASF subversion and git services commented on SOLR-445: -- Commit 2401c9495319e1b5065b05ef3a36ee586f06b6d4 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2401c94 ] SOLR-445 Merge branch 'master' into jira/SOLR-445 (pick up SOLR-8738 changes)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174039#comment-15174039 ] ASF subversion and git services commented on SOLR-445: -- Commit 2401c9495319e1b5065b05ef3a36ee586f06b6d4 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2401c94 ] SOLR-445 Merge branch 'master' into jira/SOLR-445 (pick up SOLR-8738 changes)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166553#comment-15166553 ] ASF subversion and git services commented on SOLR-445: -- Commit 98e8c344b81a169f20b09742e48f423f533837f6 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=98e8c34 ] SOLR-445: no need to track maxErrors explicitly, that's what errors.size() is for
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166555#comment-15166555 ] ASF subversion and git services commented on SOLR-445: -- Commit 4ce376fa0f1e6acc84744582ddba6dfe9fd6f11a in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4ce376f ] SOLR-445: replace errors map with List and tweak public so we can differentiate errors of diff types for example: an error on deleteById for docId1 vs an error on add for docId1
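The motivation in the commit message above (an add error and a deleteById error for the same docId1 must both survive) can be illustrated with a short sketch. Everything here is hypothetical: {{ErrorCollectionSketch}}, {{CmdType}} and {{Err}} are illustration-only names, not the classes used on the branch, and the map is assumed to have been keyed by document id alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; names are not the ones used on the branch. */
public class ErrorCollectionSketch {

  public enum CmdType { ADD, DELETE_BY_ID }

  public static final class Err {
    public final CmdType type;
    public final String id;
    public final String message;

    public Err(CmdType type, String id, String message) {
      this.type = type;
      this.id = id;
      this.message = message;
    }
  }

  /** Keyed by id only: a later error for the same id replaces the earlier one. */
  public static Map<String, String> collectInMap(List<Err> errs) {
    Map<String, String> byId = new HashMap<>();
    for (Err e : errs) {
      byId.put(e.id, e.message);
    }
    return byId;
  }

  /** A list keeps one entry per failed command, together with its type. */
  public static List<Err> collectInList(List<Err> errs) {
    return new ArrayList<>(errs);
  }

  public static void main(String[] args) {
    List<Err> errs = new ArrayList<>();
    errs.add(new Err(CmdType.ADD, "docId1", "add failed"));
    errs.add(new Err(CmdType.DELETE_BY_ID, "docId1", "delete failed"));
    System.out.println("map entries:  " + collectInMap(errs).size());
    System.out.println("list entries: " + collectInList(errs).size());
  }
}
```

With the map, the add failure for docId1 is silently overwritten by the delete failure; the list keeps both and lets a caller tell them apart by type.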
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166554#comment-15166554 ] ASF subversion and git services commented on SOLR-445: -- Commit 08bcb769bd1e896e719ebb0b4512208c993d9c38 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=08bcb76 ] SOLR-445: refactor metadata key+val parsing/formatting to use a new static inner helper class (KnownErr)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166551#comment-15166551 ] ASF subversion and git services commented on SOLR-445: -- Commit 2e5c5b022ed2f185b95c745ccdbd0a142e3b5794 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2e5c5b0 ] SOLR-445: prune out bogus (ie: technically infeasible to be accurate) 'numAdds' code
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166552#comment-15166552 ] ASF subversion and git services commented on SOLR-445: -- Commit d5fd11999e17f52939ff0a2dd54572f0a95431ca in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d5fd119 ] SOLR-445: loop over all distributed errors
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166549#comment-15166549 ] ASF subversion and git services commented on SOLR-445: -- Commit bc5dfeeff1a182630fc3b55be3cf2f4fe164d446 in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=bc5dfee ] SOLR-445: play nice with SOLR-8674 test changes
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166548#comment-15166548 ] ASF subversion and git services commented on SOLR-445: -- Commit a58ad2a6b11077a24040810b9c6b6d84f15b055d in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a58ad2a ] Merge branch 'master' into jira/SOLR-445
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166556#comment-15166556 ] ASF subversion and git services commented on SOLR-445: -- Commit 0f0571928012c071c62fb928d2142d21fa183e2b in lucene-solr's branch refs/heads/jira/SOLR-445 from [~hossman_luc...@fucit.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f05719 ] SOLR-445: basic support and tests for being tolerant of deletion failures

the tests that send deleteByQuerys to non shard leaders are failing because the resulting error details are null, need to re-review the DBQ logic in DUP to figure out where these failures are getting lost (see testVariousDeletesViaNoCollectionClient, testVariousDeletesViaShard1NonLeaderClient, testVariousDeletesViaShard2NonLeaderClient)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155031#comment-15155031 ] Hoss Man commented on SOLR-445: ---

bq. Huh? What does SOLR-8633 have to do with calling setException?

Sorry, nothing ... it's been a while since i looked at the specifics of that code and i spaced out on what we were actually talking about.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154793#comment-15154793 ] Mark Miller commented on SOLR-445: -- Let's not get too pedantic about adding comments to help future devs avoid bad decisions when we find bad decisions. Easier to just add the comment and make the code base a little easier to understand (which I've taken a stab at in the above branch).
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154775#comment-15154775 ] Mark Miller commented on SOLR-445: -- Huh? What does SOLR-8633 have to do with calling setException? I'd say it fits right here. Here is where it's talked about, here is where it's changed in a patch...
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154647#comment-15154647 ] Hoss Man commented on SOLR-445: ---

bq. and the comment I mentioned above.

I still don't understand why that's part of _this_ issue and not SOLR-8633, where the actual bad behavior needs to be fixed (and that should be done independently of, and prior to, trying to move forward here)
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154641#comment-15154641 ]

Mark Miller commented on SOLR-445:
----------------------------------

Pushed a branch of this patch with some minor cleanup I needed to remove all Eclipse errors and the comment I mentioned above.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154637#comment-15154637 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit 3a9da7ae576f35e742ec54a72da2d4224066bb63 in lucene-solr's branch refs/heads/jira/SOLR-445 from markrmiller
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a9da7a ]

SOLR-445: hossman's feb 5 2016 patch
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154638#comment-15154638 ]

ASF subversion and git services commented on SOLR-445:
------------------------------------------------------

Commit fd12a5b9f8d6319945d4445ac31e650bd1627dfc in lucene-solr's branch refs/heads/jira/SOLR-445 from markrmiller
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=fd12a5b ]

SOLR-445: some cleanup
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154609#comment-15154609 ]

Mark Miller commented on SOLR-445:
----------------------------------

bq. I'm not sure what exactly it should say, but that seems orthogonal to the current issue

Seems like part of this issue to me. As you point out and fix in this issue, we really should not be calling setException explicitly; that should be done in one place. So you fix it here, but the best way to prevent this kind of thing from creeping back in is documentation. It's hardly worth another issue to add a comment to the effect of what you already wrote above, though.

So something along the lines of: You should not generally add new calls to this method; you should throw exceptions and let the existing infra handle it.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154525#comment-15154525 ]

Yonik Seeley commented on SOLR-445:
-----------------------------------

bq. I'd also love to make this standard behavior and not some optional update processor.

+1 to that... big value in having it by default.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154493#comment-15154493 ]

Hoss Man commented on SOLR-445:
-------------------------------

bq. Can we document that on the setException method?

I'm not sure what exactly it should say, but that seems orthogonal to the current issue -- feel free to add whatever you think makes sense as a distinct git commit. My point was just that the current behavior is very inconsistent with the way exceptions are normally processed in Solr, and doesn't give "upstream" callers the chance to catch/handle the exception.

bq. If we could get this out in a major version, I'd also love to make this standard behavior and not some optional update processor.

Agreed - but for now, to minimize the invasiveness, I'd prefer to continue on the path of using a custom update processor, and then later we can assess refactoring the code to make this a more integrated feature and change that update processor to a no-op (particularly since there is going to be some overhead in counting/tracking the errors).

bq. Oh wait a minute, are you only doing that when maxErrors is exceeded? In that case failing the request makes sense to me I guess.

Yeah, that's the context of the question ... I'm leaning towards agreeing with you, particularly since (as things stand now) the caller can access the SolrJ exception metadata to see exactly what failed (but I really wish there was an easier way to access the _full_ response body in those cases).

bq. Anyhow, it does seem safer to me to not break and process all the errors.

Yeah, that was my thinking.
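A minimal sketch of the behavior discussed in this comment: tolerate individual failures and fail the whole request only once maxErrors is exceeded, rethrowing the first failure. This is not the patch's code. `TolerantBatchSketch`, `add`, and `process` are hypothetical names, the "BAD" check stands in for real field validation, and treating -1 as "unlimited" and comparing with "greater than" are assumptions based on the issue summary.

```java
import java.util.ArrayList;
import java.util.List;

final class TolerantBatchSketch {
  // Hypothetical validation: a document is "bad" if it contains the marker text.
  static void add(String doc) {
    if (doc.contains("BAD")) {
      throw new IllegalArgumentException("Error adding doc: " + doc);
    }
  }

  // Returns the messages of the tolerated errors. Once more than maxErrors
  // failures have been seen, the first failure is rethrown to the caller.
  static List<String> process(List<String> docs, int maxErrors) {
    int limit = (maxErrors == -1) ? Integer.MAX_VALUE : maxErrors; // -1 means unlimited
    List<RuntimeException> errors = new ArrayList<>();
    for (String doc : docs) {
      try {
        add(doc);
      } catch (RuntimeException e) {
        errors.add(e);
        if (errors.size() > limit) {
          throw errors.get(0); // fail the request with the first error
        }
      }
    }
    List<String> messages = new ArrayList<>();
    for (RuntimeException e : errors) {
      messages.add(e.getMessage());
    }
    return messages;
  }

  public static void main(String[] args) {
    List<String> docs = new ArrayList<>();
    docs.add("1");
    docs.add("2 BAD");
    docs.add("3");
    System.out.println(process(docs, -1)); // one tolerated error, docs 1 and 3 accepted
  }
}
```

With a limit of 0 the same batch would fail as a whole, which matches the "only when maxErrors is exceeded" case agreed on above.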
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154266#comment-15154266 ]

Mark Miller commented on SOLR-445:
----------------------------------

{code}
if ("LeaderChanged".equals(cause)) {
  // let's just fail this request and let the client retry? or just call processAdd again?
  log.error("On "+cloudDesc.getCoreNodeName()+", replica "+replicaUrl+
      " now thinks it is the leader! Failing the request to let the client retry! "+error.e);
  errorsForClient.add(error);
  break; // nocommit: why not continue?
}
{code}

Been thinking about this a little. Perhaps the idea is: we know the error is not from a forwarded request, since we skip those here. That means they must all be leader-to-replica errors, and if the leader changed, we can't put anyone in LIR anyway.

Anyhow, it does seem safer to me to not break and process all the errors.
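The difference between `break` and `continue` in the snippet above can be shown with a small self-contained sketch. `ReplicaError`, `LeaderChangedSketch`, and `errorsForClient` are hypothetical stand-ins and not the actual DistributedUpdateProcessor code; handling of other error causes is deliberately omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an error reported by a replica; not a Solr class.
final class ReplicaError {
  final String replicaUrl;
  final String cause;

  ReplicaError(String replicaUrl, String cause) {
    this.replicaUrl = replicaUrl;
    this.cause = cause;
  }
}

final class LeaderChangedSketch {
  // Collects the "LeaderChanged" errors to report to the client.
  // stopAtFirst=true mimics `break`; stopAtFirst=false mimics `continue`.
  static List<ReplicaError> errorsForClient(List<ReplicaError> errors, boolean stopAtFirst) {
    List<ReplicaError> out = new ArrayList<>();
    for (ReplicaError error : errors) {
      if ("LeaderChanged".equals(error.cause)) {
        out.add(error);
        if (stopAtFirst) {
          break; // every remaining error is left unexamined
        }
        continue; // move on to the next error
      }
      // Other causes would be handled here; omitted in this sketch.
    }
    return out;
  }

  public static void main(String[] args) {
    List<ReplicaError> errors = Arrays.asList(
        new ReplicaError("http://host1/solr", "LeaderChanged"),
        new ReplicaError("http://host2/solr", "Other"),
        new ReplicaError("http://host3/solr", "LeaderChanged"));
    System.out.println(errorsForClient(errors, true).size());
    System.out.println(errorsForClient(errors, false).size());
  }
}
```

With `break`, the second "LeaderChanged" error is never reported to the client; with `continue`, every error in the list is examined.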
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154253#comment-15154253 ]

Mark Miller commented on SOLR-445:
----------------------------------

{quote}
// nocommit: should this really be a top level exception?
// nocommit: or should it be an HTTP:200 with the details of what failed in the body?
{quote}

For the Collections API I went with HTTP:200. The overall request to the server succeeded; here are the individual update failures. I guess I lean that way a bit. If 1 update out of 1000 fails, it seems kind of strange to fail the whole request with this new code.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154226#comment-15154226 ]

Mark Miller commented on SOLR-445:
----------------------------------

I have no issue with punting numAdds on this.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154220#comment-15154220 ]

Mark Miller commented on SOLR-445:
----------------------------------

bq. One notable change here is that i switched DUP.finish() from directly calling SolrQueryResponse.setException() and instead made it throw the exception. Independent of this issue, the existing behavior seems like a bug / bad-form – what if the caller already caught some earlier exception it wants to return and finish() is just being called in finally?

Can we document that on the setException method?

bq. maxErrors

Yeah, works for the end user. Internally, if we had a good way to track all the failures in some efficient manner (we learn about them as they happen or something), we could perhaps use a single ConcurrentUpdateSolrClient per replica and be much more connection efficient. Kind of beyond this issue, but my interest in this issue is that it seems to be the start of that path.
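The concern quoted above (a finish() that calls setException directly can clobber an earlier failure when it runs in a finally block) can be illustrated with a self-contained sketch. `ResponseSketch` and `FinishSketch` are hypothetical stand-ins, not Solr's SolrQueryResponse or DistributedUpdateProcessor, and the exception messages are made up for illustration.

```java
// Hypothetical stand-in for a response object that holds a single exception.
final class ResponseSketch {
  private Exception exception;

  void setException(Exception e) {
    this.exception = e;
  }

  Exception getException() {
    return exception;
  }
}

final class FinishSketch {
  // Old style: finish() records its own failure directly on the response.
  static void finishSettingException(ResponseSketch rsp) {
    rsp.setException(new IllegalStateException("finish failed"));
  }

  // New style: finish() throws and leaves the decision to the caller.
  static void finishThrowing() {
    throw new IllegalStateException("finish failed");
  }

  // The caller already recorded a failure and calls finish() in finally.
  static String runOldStyle() {
    ResponseSketch rsp = new ResponseSketch();
    try {
      rsp.setException(new IllegalArgumentException("bad document"));
    } finally {
      finishSettingException(rsp); // silently replaces "bad document"
    }
    return rsp.getException().getMessage();
  }

  // The caller catches what finish() throws and keeps the earlier failure.
  static String runNewStyle() {
    ResponseSketch rsp = new ResponseSketch();
    Exception first = new IllegalArgumentException("bad document");
    try {
      finishThrowing();
    } catch (RuntimeException fromFinish) {
      first.addSuppressed(fromFinish); // the later failure is kept as context
    }
    rsp.setException(first);
    return rsp.getException().getMessage();
  }

  public static void main(String[] args) {
    System.out.println(runOldStyle());
    System.out.println(runNewStyle());
  }
}
```

In the old style the client sees only the failure from finish(); in the new style the caller decides which exception is reported, which is the point of the proposed javadoc note on setException.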
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154210#comment-15154210 ]

Mark Miller commented on SOLR-445:
----------------------------------

Wow, I've never seen this old beast issue. Goodness, with all this work if we could just fix the error handling too for 6.0... If we had proper error responses, oh the things we could do. I don't know how you handle when 100,000 docs in your stream fail though.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633103#comment-14633103 ]

Anshum Gupta commented on SOLR-445:
-----------------------------------

I'm seeing a few errors with the current patch and I think I know what's going on. I'll take a look at it and update the patch tomorrow.

> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Assignee: Anshum Gupta
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634464#comment-14634464 ]

Noble Paul commented on SOLR-445:
---------------------------------

I guess it would be better if we return the whole command instead of just the id to the user
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269854#comment-14269854 ]

Hoss Man commented on SOLR-445:
-------------------------------

bq. It works only in the case of the update arriving to the shard leader (as it would fail while adding the doc locally), but if the update needs to be forwarded to the leader, then it will not work.

...i'm not sure if this will solve all of the problems Tomas ran into, but one thing that might help (and was added after the latest version of the patch was written) is the UpdateRequestProcessorFactory.RunAlways marker interface. It gives UpdateProcessorFactories a mechanism to say they want to be run as part of the chain even if the update.distrib logic would normally skip them for already having been run on a previous node (ie: the update has already been forwarded once).

So that interface, combined with some basic checks of "am i the leader?", could allow this processor to ensure it was always/only executing some bits of logic on the leader. (There might still be some problems, however, in terms of accurately responding/reporting aggregate failures when batch updates involve docs that go to different leaders.)

> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Fix For: 4.9, Trunk
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
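The marker-interface idea in this comment can be sketched without any Solr dependency. The real interface is UpdateRequestProcessorFactory.RunAlways, as named above; everything below (`FactorySketch`, `RunAlwaysSketch`, `ChainSketch`, and the simplified skip rule) is a hypothetical stand-in, and the "am i the leader?" check is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an update processor factory.
interface FactorySketch {
  String name();
}

// Hypothetical marker interface: it declares no methods, it only tags a factory.
interface RunAlwaysSketch {
}

final class TolerantFactorySketch implements FactorySketch, RunAlwaysSketch {
  public String name() {
    return "tolerant";
  }
}

final class OtherFactorySketch implements FactorySketch {
  public String name() {
    return "other";
  }
}

final class ChainSketch {
  // Simplified rule: on a request that was already forwarded once,
  // only factories tagged with the marker interface still run.
  static List<String> factoriesToRun(List<FactorySketch> chain, boolean alreadyForwarded) {
    List<String> names = new ArrayList<>();
    for (FactorySketch factory : chain) {
      if (!alreadyForwarded || factory instanceof RunAlwaysSketch) {
        names.add(factory.name());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    List<FactorySketch> chain = Arrays.asList(new TolerantFactorySketch(), new OtherFactorySketch());
    System.out.println(factoriesToRun(chain, false));
    System.out.println(factoriesToRun(chain, true));
  }
}
```

Because the marker carries no behavior, opting in costs a factory nothing beyond the `implements` clause; the chain alone decides what the tag means.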
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113986#comment-14113986 ]

Tomás Fernández Löbbe commented on SOLR-445:
--------------------------------------------

[~shidunce] Sorry I missed your comment. I understand the issue you are seeing is not with any of the patches in this Jira, right? If so, you should ask the question on the users list; you'll get many more eyes on your problem that way than by posting here.

> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Fix For: 4.9, 5.0
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110647#comment-14110647 ]

Denis Shishlyannikoc commented on SOLR-445:
-------------------------------------------

A question related to this JIRA.

After failing to index a document with an invalid date value (2014-03-18K18:15:13Z), Solr kept the document in some queue and tried to reindex it again (roughly one attempt every 3-5 minutes; I did not measure the exact interval), logging the same "failed to parse date" exception each time. After a Solr server restart the issue went away: no more attempts to reindex the problematic document.

This does not look like correct behavior. How can it be explained, and how can I avoid such reindexing? I don't mind losing some invalid documents, but I don't want Solr to get stuck on them after a failure.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982423#comment-13982423 ]

Tomás Fernández Löbbe commented on SOLR-445:

bq. As a side note, this DistributedUpdateProcessor behavior makes it “tolerant”, but only in some cases?

I have confirmed this. Depending on which node gets the initial update request and the position of the invalid doc in the batch, the docs that end up indexed will vary from 0 to all but the invalid doc.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974536#comment-13974536 ]

Tomás Fernández Löbbe commented on SOLR-445:

I uploaded a new patch with more javadocs and with the test chain names changed.

{quote}
even if maxErrors isn't reached, we should consider carefully whether or not it makes sense to be returning a 200 status code even if every update command that's executed for a request fails. (ie: if maxErrors defaults to Integer.MAX_VALUE, and i send 100 docs and all 100 fail, should i really get a 200 status code back?)
{quote}

I think this would make it more confusing. Having this processor means that the client wants to manage failing docs on their side. If all the docs fail, so be it; they’ll know how to manage it on their side. I don’t think that should be a special case. Plus, I think getting the 200 gives you more information: it tells you that Solr tried adding all the docs the client sent and didn’t abort somewhere in the middle, as would have happened if you got a 4XX/5XX.

I was also thinking that this processor won’t work together with DistributedUpdateProcessor: it has its own error processing, plus the distribution would create multiple internal requests (chains too), right? Also, the ConcurrentUpdateSolrServer used in SolrCmdDistributor would batch docs in a non-deterministic way, right? It would be impossible to count errors at this level.
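The point above, that a client using this processor manages failing docs itself even though the request returns a 200, can be sketched in Python. This is only an illustration: it assumes the response shape quoted elsewhere in this thread for the committed processor (a responseHeader.errors list whose entries carry type, id and message), which may differ from the patch under discussion here. The helper name failed_ids and the sample payload are hypothetical.

```python
import json


def failed_ids(response_text):
    """Return {id: message} for every tolerated error in an update response.

    Assumes errors are reported under responseHeader.errors, each entry
    having "id" and "message" keys (an assumption, see the note above).
    """
    header = json.loads(response_text).get("responseHeader", {})
    return {err["id"]: err["message"] for err in header.get("errors", [])}


# Hypothetical response body: status is 0 even though one document failed.
sample = json.dumps({
    "responseHeader": {
        "errors": [
            {"type": "ADD", "id": "1",
             "message": "ERROR: [doc=1] Error adding field 'foo_i'='bogus'"},
        ],
        "maxErrors": -1,
        "status": 0,
        "QTime": 1,
    }
})

sent = ["1", "2", "3"]                 # ids the client put in the request
failures = failed_ids(sample)          # ids the server reported as failed
succeeded = [doc_id for doc_id in sent if doc_id not in failures]
print("failed:", sorted(failures), "succeeded:", succeeded)
```

With this approach the HTTP status only says that the whole batch was attempted; the client has to inspect the error list to learn which documents need attention.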
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974667#comment-13974667 ]

Hoss Man commented on SOLR-445:
---

bq. I think this would make it more confusing. Having this processor means that the client wants to manage failing docs on their side. If all the docs fail so be it.

Yeah, I'm not convinced you're wrong -- I just wasn't sure how I felt about it and I wanted to make sure we considered it. Even if users configure this, they might be surprised if something like a schema.xml mismatch with some update process they are using causes a 500 error on every individual update -- but still results in a 200 coming back because of this component.

But I think you are right -- as long as the docs are clear that the status will _always_ be a 200, even if all docs fail, we're fine.

bq. I was also thinking that this processor won’t work together with DistributedUpdateProcessor, it has its own error processing, plus the distribution would create multiple internal requests...

As long as this processor is configured before the DistributedUpdateProcessorFactory it should work fine:

* when the requests get forwarded to other shards, they'll bypass this processor (and any other processors that come before DistributedUpdateProcessorFactory) so it won't break the cumulative error handling in DistributedUpdateProcessorFactory
* DistributedUpdateProcessorFactory still ultimately throws only one Exception per UpdateCommand when it forwards to multiple replicas, so your new processor will still get at most 1 error to track per doc when accumulating results to return to the client

but it's trivial to write a distributed version of your test case to prove that you get the results you expect -- probably a good idea to write one to help future proof this processor against unforeseen future changes in the distributed update processing
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968996#comment-13968996 ]

Hoss Man commented on SOLR-445:
---

Tomas:

* we need better class level javadocs for the TolerantUpdateProcessorFactory - basically everything currently in the TolerantUpdateProcessor's javadocs, plus some example configuration, plus a note about how maxErrors can be specified as a request param or as an init param and an explanation of the default behavior if maxErrors isn't specified at all
* would you mind renaming tolerant-chain1 and tolerant-chain2 with more descriptive names to make the tests easier to read? perhaps tolerate-10-failures-chain and tolerate-unlimited-failures-chain ?
* even if maxErrors isn't reached, we should consider carefully whether or not it makes sense to be returning a 200 status code even if _every_ update command that's executed for a request fails. (ie: if maxErrors defaults to Integer.MAX_VALUE, and i send 100 docs and all 100 fail, should i really get a 200 status code back?)
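The review note about maxErrors being settable either as an init param or as a request param can be illustrated with a small Python sketch. It is not the processor's code: the function name effective_max_errors is hypothetical, the precedence (request param wins) and the "-1 means unlimited, stored as Integer.MAX_VALUE" rule follow the issue summary quoted elsewhere in this thread, and the fallback used when neither is given is an assumption passed in explicitly.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def effective_max_errors(init_param=None, request_param=None, default=INT_MAX):
    """Resolve the maxErrors value that applies to one request.

    request_param overrides init_param; -1 is treated as "unlimited".
    `default` is an assumed fallback for when neither is specified.
    """
    raw = request_param if request_param is not None else init_param
    if raw is None:
        return default
    value = int(raw)
    if value < -1:
        raise ValueError("maxErrors must be >= -1, got %d" % value)
    return INT_MAX if value == -1 else value


print(effective_max_errors(init_param=10))                      # config only
print(effective_max_errors(init_param=10, request_param="-1"))  # per-request override
```

Keeping the resolution in one place makes the documented default easy to state, which is what the javadoc request above is asking for.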
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967726#comment-13967726 ]

Tomás Fernández Löbbe commented on SOLR-445:

Any more thoughts on this patch?
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956720#comment-13956720 ]

Hoss Man commented on SOLR-445:
---

bq. The errors are managed by an UpdateRequestProcessor that must be added before other processors in the chain.

Off the cuff: this sounds like a great idea.

The one piece of feedback that occurred to me though would be to tweak the response format so that there is a 1-to-1 correspondence of documents in the initial request to statuses in the response -- even if the schema doesn't use uniqueKey...

{code}
<lst name="responseHeader">
  <int name="numErrors">10</int>
  <lst name="results">
    <!-- if schema has uniqueKeys, they are the names of the response -->
    <lst name="42" /> <!-- success so empty -->
    <lst name="1"> <!-- 2nd doc in update, with uniqueKey of 1 had this failure -->
      <str name="message">ERROR: [doc=1] Error adding field 'weight'='b' msg=For input string: "b"</str>
    </lst>
    <lst name="60" /> <!-- success so empty -->
    <lst name="3"> <!-- 4th doc in update, with uniqueKey of 3 had this failure -->
      <str name="message">ERROR: [doc=3] Error adding field 'weight'='b' msg=For input string: "b"</str>
    </lst>
    ...
  <int name="status">0</int>
  <int name="QTime">17</int>
</lst>
{code}

?
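The 1-to-1 response idea suggested above (one status entry per submitted document, empty on success) can be sketched as follows. This is a hypothetical Python illustration, not Solr code: per_document_results and add_doc are made-up names, and the numeric "weight" check merely stands in for whatever validation the real update chain performs.

```python
def per_document_results(docs, add_doc):
    """Try every doc and return one result entry per doc, in request order.

    A successful doc gets an empty dict; a failed doc gets {"message": ...}.
    """
    results = []
    num_errors = 0
    for doc in docs:
        try:
            add_doc(doc)
            results.append((doc["id"], {}))
        except ValueError as exc:
            num_errors += 1
            results.append((doc["id"], {"message": str(exc)}))
    return {"numErrors": num_errors, "results": results}


def add_doc(doc):
    # Hypothetical stand-in for indexing: "weight" must parse as a number.
    float(doc["weight"])


docs = [
    {"id": "42", "weight": "1.5"},
    {"id": "1", "weight": "b"},
    {"id": "60", "weight": "2"},
    {"id": "3", "weight": "b"},
]
report = per_document_results(docs, add_doc)
print(report["numErrors"], [doc_id for doc_id, _ in report["results"]])
```

Because every document produces an entry, the response grows with the request, which is the size concern raised in the replies to this proposal.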
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956748#comment-13956748 ]

Yonik Seeley commented on SOLR-445:
---

bq. even if the schema doesn't use uniqueKey...

That would lead to some huge responses. I think instead the notion of not having a uniqueKey should essentially be deprecated.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956767#comment-13956767 ]

Shalin Shekhar Mangar commented on SOLR-445:

bq. That would lead to some huge responses. I think instead the notion of not having a uniqueKey should essentially be deprecated.

+1
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956790#comment-13956790 ]

Tomás Fernández Löbbe commented on SOLR-445:

bq. I think instead the notion of not having a uniqueKey should essentially be deprecated.

+1

bq. That would lead to some huge responses.

Do you mean including the ids of good docs in the response too? I don't think that would be that big. Should be much smaller than the request.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956798#comment-13956798 ]

Yonik Seeley commented on SOLR-445:
---

bq. Do you mean including the ids of good docs in the response too? I don't think that would be that big. Should be much smaller than the request

Some people (including myself) send/load millions of docs per request - it's very unfriendly to get back megabytes of responses unless you explicitly ask.

If this processor is not in the default chain, then I guess it doesn't matter much. But I could see adding this ability by default (regardless of if it's a separate processor or not) via a parameter like maxErrors or something.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956848#comment-13956848 ]

Tomás Fernández Löbbe commented on SOLR-445:

I see. Maybe I could then add just numSucceed, as a confirmation that the rest of the docs made it in?
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291055#comment-13291055 ]

Yonik Seeley commented on SOLR-445:
---

I imagine a maxErrors parameter might be useful (and more readable than abortOnFirstBatchIndexError):

maxErrors=0 (the current behavior - stop processing more updates when we hit an error)
maxErrors=10 (allow up to 10 documents to fail before aborting the update... useful for true bulk uploading where you want to allow for an isolated failure or two, but still want to stop if every single update is failing because something is configured wrong)
maxErrors=-1 (allow an unlimited number of documents to fail)

Making updates transactional seems really tough in cloud mode since we don't keep old versions of documents around... although it might be possible for a short time with the transaction log. Anyway, that should definitely be a separate issue.

A couple of other notes:
- structured error responses were recently added in 4.0-dev that should make this issue easier in general. Example:
{code}
{"responseHeader":{"status":400,"QTime":0},"error":{"msg":"ERROR: [doc=mydoc] unknown field 'foo'","code":400}}
{code}
- Per did some error handling work that's included in a patch attached to SOLR-3178
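The three maxErrors settings proposed above (0, a positive limit, and -1) can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not Solr's update path: process_batch, add_doc and the in-memory index list are hypothetical, and a failure is modelled as a ValueError raised while adding a document.

```python
def process_batch(docs, add_doc, max_errors=0):
    """Add docs in order, tolerating up to max_errors failures.

    max_errors=0  -> abort on the first failure (the behavior described as current)
    max_errors=N  -> abort once more than N documents have failed
    max_errors=-1 -> never abort; collect every failure
    Docs added before an abort stay added (no transactional rollback).
    """
    errors = []
    added = 0
    for doc in docs:
        try:
            add_doc(doc)
            added += 1
        except ValueError as exc:
            errors.append((doc.get("id"), str(exc)))
            if max_errors != -1 and len(errors) > max_errors:
                raise RuntimeError(
                    "aborting update after %d error(s)" % len(errors)
                ) from exc
    return added, errors


index = []  # hypothetical stand-in for the search index


def add_doc(doc):
    # Hypothetical validation mirroring the bad-date example in the issue.
    if doc.get("myDateField") == "I_AM_A_BAD_DATE":
        raise ValueError("bad date")
    index.append(doc["id"])


docs = [
    {"id": "1"},
    {"id": "2", "myDateField": "I_AM_A_BAD_DATE"},
    {"id": "3"},
]
added, errors = process_batch(docs, add_doc, max_errors=-1)
print(added, errors, index)
```

Running the same batch with max_errors=0 raises after doc 1 has already been added, which reproduces the "adds the first doc and then aborts" behavior from the issue description.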
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233357#comment-13233357 ]

Erick Erickson commented on SOLR-445:
-

Well, it's clear I won't get to this in the 3.6 time frame, so if someone else wants to pick it up feel free.

However, I also wonder whether with 4.0 and SolrCloud we have to approach this differently to accommodate how documents are passed around there?
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042813#comment-13042813 ]

Grant Ingersoll commented on SOLR-445:
--

Erick, feel free to take this one and iterate as you see fit.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027545#comment-13027545 ]

Shinichiro Abe commented on SOLR-445:
-

Solr Cell has the same problem: it aborts midway when posting protected files (SOLR-2480). I hope the update handlers will be fixed by applying that model.
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027245#comment-13027245 ]

Lance Norskog commented on SOLR-445:

If the DIH semantics cover all of the use cases, please follow that model: behavior, names, etc. It will be much easier on developers.