[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anshum Gupta updated SOLR-445:
------------------------------
Attachment: SOLR-445.patch
Updated patch. I'm still working on getting the correct error count for the
following case:
Client -> Node1 -> Node2 (Leader for shard 1) -> Node3 (Non-leader Replica for
shard 1)
In the above case, Node2 as of now return an HTTP OK and doesn't throw an
exception, the StreamingSolrClient used but the Distributed Updated Processor
doesn't realize the error that was consumed by the leader of shard 1.
I'm making progress on that but have run into deadend with a few of the
approaches I've tried so far.
The TolerantUpdateProcessor as of now kicks in when
1. It's the leader node or,
2. It's the node that receives the request from the client i.e. yet to forward
the request to the leader shard.
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Assignee: Anshum Gupta
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid
> batch. Ie:
> <add>
> <doc>
> <field name="id">1</field>
> </doc>
> <doc>
> <field name="id">2</field>
> <field name="myDateField">I_AM_A_BAD_DATE</field>
> </doc>
> <doc>
> <field name="id">3</field>
> </doc>
> </add>
> Right now solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]