[ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983215#action_12983215
 ] 

Simon Rosenthal commented on SOLR-445:
--------------------------------------

bq.  Don't allow autocommits during an update. Simple. Or, rather, all update 
requests block at the beginning during an autocommit. If an update request has 
too many documents, don't do so many documents in an update. (Lance)
Lance - how do you (dynamically) disable autocommits during a specific update? 
That functionality would also be useful in other use cases, but that's 
another issue.

bq. NOTE: This does change the behavior of Solr. Without this patch, the first 
document that is incorrect stops processing. Now, it continues merrily on 
adding documents as it can. Is this desirable behavior? It would be easy to 
abort on first error if that's the consensus, and I could take some tedious 
record-keeping out. I think there's no big problem with continuing on, since 
the state of committed documents is indeterminate already when errors occur so 
worrying about this should be part of a bigger issue.

I think it should be an option, if possible. I can see use cases where 
abort-on-first-error is desirable, but also situations where you know one or 
two documents may be erroneous, and it's worth continuing on in order to index 
the other 99%.
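For illustration, the continue-past-bad-documents behavior can also be approximated on the client side today. This is a hypothetical sketch (not part of Solr or any Solr client library): when the server aborts a batch at the first bad document, the client recursively bisects the batch so the good documents still get indexed and the bad ones are isolated. The `send` callable here is a stand-in for an HTTP post to the update handler.

```python
# Hypothetical client-side workaround: bisect a batch on failure so good
# documents are still indexed even when the server aborts on the first error.

def index_batch(docs, send):
    """Try to send the whole batch; on failure, split and recurse.

    `send` raises ValueError if any document in the batch is bad,
    mimicking an update handler that aborts mid-batch.
    Returns (indexed, rejected) lists of documents.
    """
    try:
        send(docs)
        return list(docs), []
    except ValueError:
        if len(docs) == 1:
            return [], list(docs)  # isolated a single bad document
        mid = len(docs) // 2
        ok_left, bad_left = index_batch(docs[:mid], send)
        ok_right, bad_right = index_batch(docs[mid:], send)
        return ok_left + ok_right, bad_left + bad_right

# Toy "server" that rejects any batch containing a malformed date field,
# modeled on the example in this issue's description.
def fake_send(docs):
    if any(d.get("myDateField") == "I_AM_A_BAD_DATE" for d in docs):
        raise ValueError("bad document in batch")

docs = [
    {"id": "1"},
    {"id": "2", "myDateField": "I_AM_A_BAD_DATE"},
    {"id": "3"},
]
indexed, rejected = index_batch(docs, fake_send)
print([d["id"] for d in indexed])   # the good documents
print([d["id"] for d in rejected])  # the isolated bad document
```

The cost is extra round trips (O(b log n) for b bad documents), which is why per-document error reporting from the server, as proposed here, is the better long-term fix.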


> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
> --------------------------------------------------------------------
>
>                 Key: SOLR-445
>                 URL: https://issues.apache.org/jira/browse/SOLR-445
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Will Johnson
>            Assignee: Erick Erickson
>             Fix For: Next
>
>         Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
> solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures 
> mid-batch?  I.e.:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts.  It would seem like it 
> should either fail the entire batch, or log a message/return a code and then 
> continue on to add doc 3.  Option 1 would seem to be much harder to 
> accomplish and possibly require more memory, while Option 2 would require more 
> information to come back from the API.  I'm about to dig into this but I 
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

