[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents

Mark Miller (JIRA) Fri, 19 Feb 2016 05:45:06 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154220#comment-15154220
 ]


Mark Miller commented on SOLR-445:
----------------------------------

bq. One notable change here is that I switched DUP.finish() from directly 
calling SolrQueryResponse.setException() and instead made it throw the 
exception. Independent of this issue, the existing behavior seems like a bug / 
bad-form – what if the caller already caught some earlier exception it wants to 
return and finish() is just being called in finally?

Can we document that on the setException method?
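
To make the concern in the quoted paragraph concrete, here is a minimal sketch of the two styles of finish(). Everything in it is a hypothetical stand-in: the Response class only mimics the "holds one exception" idea and is not the real SolrQueryResponse, and the two finish methods are invented for illustration, not taken from the patch.

```java
import java.io.IOException;

/** Hypothetical stand-ins only; none of this is the real Solr code. */
public class FinishSketch {

    /** Minimal stand-in for a response object that holds at most one exception. */
    static class Response {
        private Exception exception;
        void setException(Exception e) { this.exception = e; }
        Exception getException() { return exception; }
    }

    /** Old style (assumed): finish() writes its failure straight into the response. */
    static void finishBySetting(Response rsp) {
        rsp.setException(new IOException("failure in finish()"));
    }

    /** New style (assumed): finish() throws and leaves the response untouched. */
    static void finishByThrowing() throws IOException {
        throw new IOException("failure in finish()");
    }

    /** The caller records an earlier error, then calls the old-style finish() in finally. */
    static String callerWithSettingFinish() {
        Response rsp = new Response();
        try {
            throw new IllegalArgumentException("bad document");
        } catch (IllegalArgumentException e) {
            rsp.setException(e);
        } finally {
            finishBySetting(rsp); // silently overwrites "bad document"
        }
        return rsp.getException().getMessage();
    }

    /** Same caller, but finish() throws, so the caller chooses which error wins. */
    static String callerWithThrowingFinish() {
        Response rsp = new Response();
        try {
            throw new IllegalArgumentException("bad document");
        } catch (IllegalArgumentException e) {
            rsp.setException(e);
        } finally {
            try {
                finishByThrowing();
            } catch (IOException late) {
                if (rsp.getException() != null) {
                    // Keep the earlier error and attach the later one to it.
                    rsp.getException().addSuppressed(late);
                } else {
                    rsp.setException(late);
                }
            }
        }
        return rsp.getException().getMessage();
    }

    public static void main(String[] args) {
        System.out.println(callerWithSettingFinish());  // failure in finish()
        System.out.println(callerWithThrowingFinish()); // bad document
    }
}
```

In the first caller the earlier error is lost without the caller having any say in it; in the second the caller decides what to report, which is the behavior the quoted change argues for.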

bq. maxErrors

Yeah, works for the end user. Internally, if we had a good way to track all the 
failures efficiently (for example, learning about them as they happen), we 
could perhaps use a single ConcurrentUpdateSolrClient per replica and be much 
more connection efficient. That is somewhat beyond this issue, but my interest 
in this issue is that it seems to be the start of that path.
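
As a rough illustration of the maxErrors idea from the end user's point of view, the sketch below keeps adding documents after a bad one and only aborts once the number of failures exceeds a limit. It is a self-contained toy, not Solr code: the class, the Result type, the map-based "documents", and the use of LocalDate.parse as the validation step are all assumptions made for the example, and the exact semantics of maxErrors in the patch may differ.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of a tolerant batch add; not the real update chain. */
public class TolerantBatchSketch {

    /** What the caller gets back: ids that were added, and per-id failures. */
    static class Result {
        final List<String> added = new ArrayList<>();
        final Map<String, String> errors = new LinkedHashMap<>();
        boolean aborted;
    }

    /**
     * Each entry maps a document id to an optional date string (null = no date field).
     * Failures are recorded as they happen; the batch aborts only once the
     * number of failures exceeds maxErrors (assumed semantics).
     */
    static Result addAll(Map<String, String> docs, int maxErrors) {
        Result result = new Result();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            try {
                if (doc.getValue() != null) {
                    LocalDate.parse(doc.getValue()); // stand-in for field validation
                }
                result.added.add(doc.getKey());
            } catch (DateTimeParseException e) {
                result.errors.put(doc.getKey(), e.getMessage());
                if (result.errors.size() > maxErrors) {
                    result.aborted = true;
                    break;
                }
            }
        }
        return result;
    }

    /** The three documents from the issue description: ids 1, 2 (bad date), 3. */
    static Map<String, String> exampleDocs() {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", null);
        docs.put("2", "I_AM_A_BAD_DATE");
        docs.put("3", null);
        return docs;
    }

    public static void main(String[] args) {
        Result tolerant = addAll(exampleDocs(), 1);
        System.out.println(tolerant.added + " failed=" + tolerant.errors.keySet());
        Result strict = addAll(exampleDocs(), 0);
        System.out.println(strict.added + " aborted=" + strict.aborted);
    }
}
```

With a limit of 1 the batch adds documents 1 and 3 and reports document 2 as failed; with a limit of 0 it stops at document 2, which matches the abort-on-first-error behavior described in the issue.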

> Update Handlers abort with bad documents
> ----------------------------------------
>
>                 Key: SOLR-445
>                 URL: https://issues.apache.org/jira/browse/SOLR-445
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Will Johnson
>            Assignee: Hoss Man
>         Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, 
> SOLR-445-alternative.patch, SOLR-445-alternative.patch, 
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, 
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, 
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid 
> batch?  For example:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts.  It would seem like it 
> should either (1) fail the entire batch or (2) log a message/return a code 
> and then continue on to add doc 3.  Option 1 would seem to be much harder to 
> accomplish and possibly require more memory, while Option 2 would require 
> more information to come back from the API.  I'm about to dig into this but 
> I thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

