[ https://issues.apache.org/jira/browse/SOLR-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386432#comment-14386432 ]
Per Steffensen commented on SOLR-6816:
--------------------------------------

bq. I think you misunderstood my point. I wasn't talking about retrying documents in the same UpdateRequest. If a Map/Reduce task fails, the HDFS block is retried entirely, meaning a Hadoop-based indexing job may send the same docs that have already been added, so using overwrite=false is dangerous when doing this type of bulk indexing. The solution proposed in SOLR-3382 would be great to have as well though.

Well, we might be misunderstanding each other. I am not talking about retrying documents in the same UpdateRequest either.

What we have: our indexing client (something not in Solr - think of it as the Map/Reduce job) decides to issue 1000 update-doc-commands U1, U2, ..., U1000 (add-doc and delete-doc commands) by sending one bulk-job containing all of them to Solr-node S1. S1 handles some of the Us itself and forwards the other Us to the other Solr-nodes, depending on routing. For simplicity, say we have three Solr-nodes S1, S2 and S3, and that S1 handles U1-U333 itself, forwards U334-U666 to S2 and U667-U1000 to S3.

Now say that U100, U200, U400, U500, U700 and U800 fail (two on each Solr-node) and the rest succeed. S1 gets that information back from S2 and S3 (including the reason each U failed) and is able to send a response to our indexing client saying that everything succeeded, except that U100, U200, U400, U500, U700 and U800 failed, and why they failed. Some might fail due to DocumentAlreadyExists (if U was about creating a new document, assuming that it does not already exist), others might fail due to VersionConflict (if U was about updating an existing document and includes its last known (to the client) version, but the document at the server has a higher version number), and others might fail due to DocumentDoesNotExist (if U was about updating an existing document, but that document no longer exists at the server).
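The routing and per-command failure aggregation described above can be sketched as follows. This is a hypothetical, self-contained illustration, not Solr code: the node names, range-based routing, and the particular failing commands are taken from the example in the comment, while the error-kind assignment is arbitrary.

```java
import java.util.*;

// Hypothetical sketch: a coordinator node routes 1000 update commands
// U1..U1000 across three nodes and aggregates per-command failures so the
// client gets one combined response instead of an all-or-nothing error.
public class BulkRouteDemo {
    enum Err { NONE, DOC_ALREADY_EXISTS, VERSION_CONFLICT, DOC_DOES_NOT_EXIST }

    record Result(int cmdId, Err err) {}

    // Range routing as in the example: U1-U333 -> S1, U334-U666 -> S2, U667-U1000 -> S3.
    static String route(int cmdId) {
        if (cmdId <= 333) return "S1";
        if (cmdId <= 666) return "S2";
        return "S3";
    }

    // The commands that fail in the example (two per node).
    static final Set<Integer> FAILED = Set.of(100, 200, 400, 500, 700, 800);

    // Stand-in for per-node handling; the choice of error kind is arbitrary here.
    static Result handleOnNode(int cmdId) {
        if (!FAILED.contains(cmdId)) return new Result(cmdId, Err.NONE);
        return new Result(cmdId,
                cmdId % 300 == 100 ? Err.VERSION_CONFLICT : Err.DOC_DOES_NOT_EXIST);
    }

    public static void main(String[] args) {
        // Group the commands by the node that will handle them.
        Map<String, List<Integer>> byNode = new TreeMap<>();
        for (int u = 1; u <= 1000; u++)
            byNode.computeIfAbsent(route(u), k -> new ArrayList<>()).add(u);

        // Each node reports its failures; the coordinator merges them.
        List<Result> failures = new ArrayList<>();
        for (List<Integer> cmds : byNode.values())
            for (int u : cmds) {
                Result r = handleOnNode(u);
                if (r.err() != Err.NONE) failures.add(r);
            }

        System.out.println("failed: " + failures.size());
        failures.forEach(f -> System.out.println("U" + f.cmdId() + " -> " + f.err()));
    }
}
```

The point of the sketch is only the shape of the response: success overall, plus a list of (command, reason) pairs for the ones that failed.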
Our indexing client takes note of that combined response from S1, performs the appropriate actions (e.g. version-lookups) and sends a new request to the Solr-cluster, now including only U100', U200', U400', U500', U700' and U800'. We have done it like that for a long time, using our solution to SOLR-3382 (and our solution to SOLR-3178). I would expect a Map/Reduce job could do the same, playing the role of the indexing client - essentially only resending (maybe by issuing a new Map/Reduce job from the "reduce"-phase of the first Map/Reduce job) the (modified) update-commands that failed the first time.

> Review SolrCloud Indexing Performance.
> --------------------------------------
>
>                 Key: SOLR-6816
>                 URL: https://issues.apache.org/jira/browse/SOLR-6816
>             Project: Solr
>          Issue Type: Task
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Priority: Critical
>         Attachments: SolrBench.pdf
>
>
> We have never really focused on indexing performance, just correctness and
> low hanging fruit. We need to vet the performance and try to address any
> holes.
> Note: A common report is that adding any replication is very slow.
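The client-side retry flow described in the comment (inspect the per-command failures, do version-lookups, resend only the modified failed commands) can be sketched as below. This is a hypothetical illustration under assumed names (`Cmd`, `Failure`, `sendBulk`); the in-memory map stands in for the Solr cluster, and the only failure mode modeled is a version conflict.

```java
import java.util.*;

// Hypothetical sketch of the retry flow: send a bulk of update commands,
// read the per-command failure list from the combined response, look up the
// current versions for the failed ones, and resend only those.
public class RetryClientDemo {
    record Cmd(int id, long version) {}
    record Failure(int cmdId, String reason) {}

    // Stand-in for the cluster: a command fails with VersionConflict when the
    // version the client sent is older than the version the server holds.
    static List<Failure> sendBulk(List<Cmd> cmds, Map<Integer, Long> serverVersions) {
        List<Failure> failures = new ArrayList<>();
        for (Cmd c : cmds) {
            long serverVersion = serverVersions.getOrDefault(c.id(), 0L);
            if (c.version() < serverVersion)
                failures.add(new Failure(c.id(), "VersionConflict"));
        }
        return failures;
    }

    public static void main(String[] args) {
        // Documents 100 and 200 were updated behind the client's back.
        Map<Integer, Long> serverVersions = Map.of(100, 5L, 200, 7L);

        List<Cmd> bulk = new ArrayList<>();
        for (int u = 1; u <= 300; u++) bulk.add(new Cmd(u, 1L)); // client's stale view

        // First pass: everything succeeds except the two conflicting commands.
        List<Failure> failed = sendBulk(bulk, serverVersions);

        // Version-lookup for the failed commands, then resend only those (U100', U200').
        List<Cmd> retry = new ArrayList<>();
        for (Failure f : failed)
            retry.add(new Cmd(f.cmdId(), serverVersions.get(f.cmdId())));
        List<Failure> second = sendBulk(retry, serverVersions);

        System.out.println("first pass failures: " + failed.size()
                + ", after retry: " + second.size());
    }
}
```

The same loop works regardless of whether the retry is driven by a plain client or by a follow-up Map/Reduce job issued from the "reduce" phase, as suggested above.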