[ https://issues.apache.org/jira/browse/SOLR-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17349899#comment-17349899 ]
Takashi Sasaki commented on SOLR-15417:
---------------------------------------

Hi, [~hxuanyu]

h1. Batch Request

{quote}*This request only succeeded partly. Any doc in the batch after the invalid one failed to update*. Could you try it on your machine?
{quote}

When I updated in a batch, the partial failure prevented the entire commit from being executed. This feels like the right behavior.

Code:
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

import static java.lang.System.*;

public class Reproduce {
    public static void main(String[] args) throws Exception {
        SolrClient solrClient = new HttpSolrClient.Builder().withBaseSolrUrl("http://localhost:8983/solr/techproducts").build();
        // SolrClient solrClient = new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/techproducts").withThreadCount(4).withQueueSize(500).build();
        List<String> idList = List.of("TWINX2048-3200PRO", "VS1GB400C3", "VDBDB1A16", "MA147LL/A", "F8V7067-APL-KIT");
        List<SolrInputDocument> batch = new ArrayList<>();
        for (int idx = 1; idx <= idList.size(); idx++) {
            SolrInputDocument doc = new SolrInputDocument();
            if (idx == 3) {
                doc.addField("id", idList.get(idx - 1) + "_invalid");
            } else {
                doc.addField("id", idList.get(idx - 1));
            }
            doc.addField("hasUserAssertions", new HashMap<String, Object>() {{ put("set", false); }});
            // this makes sure update only succeeds when record with specified id exists
            doc.addField("_version_", 1);
            out.println("Added solr doc for record: " + doc.get("id"));
            batch.add(doc);
        }
        UpdateRequest updateRequest = new UpdateRequest();
        updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);
        updateRequest.setParam("failOnVersionConflicts", "false");
        updateRequest.add(batch); // List<SolrInputDocument> batch
        updateRequest.lastDocInBatch();
        try {
            UpdateResponse process = updateRequest.process(solrClient);
            out.println("xhk205 process = " + process.toString());
        } catch (Exception e) {
            out.println("Failed to update solr doc, error message: " + e.getMessage());
        }
    }
}
{code}

Output:
{code:java}
Added solr doc for record: id=TWINX2048-3200PRO
Added solr doc for record: id=VS1GB400C3
Added solr doc for record: id=VDBDB1A16_invalid
Added solr doc for record: id=MA147LL/A
Added solr doc for record: id=F8V7067-APL-KIT
Failed to update solr doc, error message: Error from server at http://localhost:8983/solr/techproducts: Document not found for update. id=VDBDB1A16_invalid
{code}

Query: [http://localhost:8983/solr/techproducts/select?fq=hasUserAssertions:true&q=*:*]
{code:java}
{"responseHeader": {"status": 0,"QTime": 1,"params": {"q": "*:*","fq": "hasUserAssertions:true"}},"response": {"numFound": 0,"start": 0,"docs": []}}
{code}

h1. Individual requests using parallel processing

[https://solr.apache.org/docs/8_8_2/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.html]
{quote}ConcurrentUpdateSolrClient buffers all added documents and writes them into open HTTP connections.
{quote}

Which documents are sent together depends on the queue size.
If I specify a queue size of 500 for 5 documents, they are all sent at once. Therefore, if there is one error, none of the 5 documents is committed.
If you specify a queue size of 1, every document is sent individually. In this case, all documents are committed except for the ones with errors.

What you need to do for your problem is probably to check the result of the request and handle the errors, right?
(You will need to extend the ConcurrentUpdateSolrClient.)
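The queue-size reasoning above can be illustrated with a toy model. This is only a sketch under stated assumptions, not SolrJ code: the class {{QueueSizeSketch}}, the method {{committedIds}} and the parameter {{groupSize}} are hypothetical names, and the model assumes that documents are grouped strictly in order and that a group containing one unknown id is rejected as a whole. How the real ConcurrentUpdateSolrClient groups documents also depends on threads and timing, which is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only: it does not use SolrJ and does not reproduce its internals.
public class QueueSizeSketch {

    // Splits ids into consecutive groups of at most groupSize and "commits"
    // a group only if every id in it already exists (mimics _version_ = 1).
    public static List<String> committedIds(List<String> ids, Set<String> existing, int groupSize) {
        List<String> committed = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += groupSize) {
            int end = Math.min(start + groupSize, ids.size());
            List<String> group = ids.subList(start, end);
            if (existing.containsAll(group)) {
                committed.addAll(group); // the whole group succeeds
            }
            // otherwise the whole group is dropped, including its valid ids
        }
        return committed;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of(
                "TWINX2048-3200PRO", "VS1GB400C3", "VDBDB1A16", "MA147LL/A", "F8V7067-APL-KIT");
        List<String> ids = List.of(
                "TWINX2048-3200PRO", "VS1GB400C3", "VDBDB1A16_invalid", "MA147LL/A", "F8V7067-APL-KIT");

        // One group of 5: the invalid id rejects everything.
        System.out.println(committedIds(ids, existing, 500)); // prints []
        // Groups of 1: only the invalid id is rejected.
        System.out.println(committedIds(ids, existing, 1));
        // prints [TWINX2048-3200PRO, VS1GB400C3, MA147LL/A, F8V7067-APL-KIT]
    }
}
```

In this model a group size of 500 matches the "nothing committed" result shown above, and a group size of 1 matches the "everything except the invalid document" case.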
[https://solr.apache.org/docs/8_8_2/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.html#handleError-java.lang.Throwable-]

Code:
{code:java}
import java.util.HashMap;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

import static java.lang.System.*;

public class Reproduce {
    public static void main(String[] args) throws Exception {
        // SolrClient solrClient = new HttpSolrClient.Builder().withBaseSolrUrl("http://localhost:8983/solr/techproducts").build();
        ConcurrentUpdateSolrClient solrClient = new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/techproducts").withThreadCount(4).withQueueSize(500).build();
        List<String> idList = List.of("TWINX2048-3200PRO", "VS1GB400C3", "VDBDB1A16", "MA147LL/A", "F8V7067-APL-KIT");
        for (int idx = 1; idx <= idList.size(); idx++) {
            UpdateRequest updateRequest = new UpdateRequest();
            updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);
            SolrInputDocument doc = new SolrInputDocument();
            if (idx == 3) {
                doc.addField("id", idList.get(idx - 1) + "_invalid");
            } else {
                doc.addField("id", idList.get(idx - 1));
            }
            doc.addField("hasUserAssertions", new HashMap<String, Object>() {{ put("set", true); }});
            // this makes sure update only succeeds when record with specified id exists
            doc.addField("_version_", 1);
            out.println("Added solr doc for record: " + doc.get("id"));
            updateRequest.add(doc);
            try {
                updateRequest.setParam("failOnVersionConflicts", "true");
                UpdateResponse process = updateRequest.process(solrClient);
                out.println("xhk205 process = " + process.toString());
            } catch (Exception e) {
                out.println("Failed to update solr doc, error message: " + e.getMessage());
            }
        }
        solrClient.blockUntilFinished();
    }
}
{code}

Output:
{code:java}
Added solr doc for record: id=TWINX2048-3200PRO
xhk205 process = {NOTE=the request is processed in a background stream}
Added solr doc for record: id=VS1GB400C3
xhk205 process = {NOTE=the request is processed in a background stream}
Added solr doc for record: id=VDBDB1A16_invalid
xhk205 process = {NOTE=the request is processed in a background stream}
Added solr doc for record: id=MA147LL/A
xhk205 process = {NOTE=the request is processed in a background stream}
Added solr doc for record: id=F8V7067-APL-KIT
xhk205 process = {NOTE=the request is processed in a background stream}
{code}

Query: [http://localhost:8983/solr/techproducts/select?fq=hasUserAssertions:true&q=*:*]
{code:java}
{responseHeader: {status: 0,QTime: 0,params: {q: "*:*",fq: "hasUserAssertions:true"}},response: {numFound: 0,start: 0,docs: [ ]}}
{code}

> exception in updateRequest caused all subsequent update fail
> ------------------------------------------------------------
>
> Key: SOLR-15417
> URL: https://issues.apache.org/jira/browse/SOLR-15417
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: UpdateRequestProcessors
> Affects Versions: 8.5.1
> Reporter: xuanyu huang
> Priority: Minor
>
> Hi there,
> I'm using solrj 8.8.2 for a 8.5.1 solr server. I have a list of records and
> in a for loop I construct an updateRequest to update each record.
> Code looks like this
> {code:java}
> for (Map<String, Object> map : maps) {
>     if (map.containsKey("record_uuid")) {
>         UpdateRequest updateRequest = new UpdateRequest();
>         updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);
>         SolrInputDocument doc = new SolrInputDocument();
>         if (idx == 3) {
>             doc.addField("id", map.get("record_uuid") + "_invalid");
>         } else {
>             doc.addField("id", map.get("record_uuid"));
>         }
>         idx++;
>         doc.addField("hasUserAssertions", new HashMap<String, Object>() {{ put("set", true); }});
>         // this makes sure update only succeeds when record with specified id exists
>         doc.addField("_version_", 1);
>         logger.debug("Added solr doc for record: " + doc.get("id"));
>         updateRequest.add(doc);
>         try {
>             updateRequest.setParam("failOnVersionConflicts", "false");
>             UpdateResponse process = updateRequest.process(solrClient);
>             System.out.println("xhk205 process = " + process.toString());
>         } catch (Exception e) {
>             logger.error("Failed to update solr doc, error message: " + e.getMessage(), e);
>         }
>     }
> }
> {code}
> There are 5 requests in total and I intentionally set the id in the 3rd request
> to an invalid id so that the updateRequest for the 3rd record should fail. (This is
> to mimic the situation where the record to be updated no longer exists in
> solr. I only want the updates with a valid id to succeed; updates
> with an invalid id should fail or be rejected instead of creating a new record in
> solr, so I used _version_=1.)
>
> Also I used the partial update syntax.
> The variable doc looks like this
> {code:java}
> {
>   "id":"2d4b625d-8809-461f-b19b-d0c963e038ed",
>   "hasUserAssertions":{"set":true}
> }
> {code}
>
> {color:#de350b}Since each update is put into its own request, I supposed only
> the 3rd request would fail because there's no record with that id and I've set
> _version_ to 1. But in reality only the first 2
> records were updated and the other 3 were not.{color}
> {color:#de350b}When I queried in the solr admin console after the update, with
> [http://localhost:8983/solr/biocache/select?fq=hasUserAssertions:true&q=*:*]
> there were only 2 records returned instead of 4.{color}
>
> Below is the log from IntelliJ IDEA:
>
> {code:java}
> - Added solr doc for record: id=429cfa88-2e18-46b0-ab9f-f4efd9e36c3c
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=5a80561b-a68d-46a3-a59b-03d267f35d0e
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=ff2dcbee-9c05-491f-91a8-9f1fec348546_invalid
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=baf7af1f-1525-403a-95bf-e28e432f1b12
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=4ea76605-c262-409b-845e-213f11ea4e34
> xhk205 process = {NOTE=the request is processed in a background stream}{code}
> {code:java}
> 2021-05-19 14:12:16,827 ERROR: [ConcurrentUpdateSolrClient] - error
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/biocache: Conflict request: http://localhost:8983/solr/biocache/update?commit=true&softCommit=false&waitSearcher=false&failOnVersionConflicts=false&wt=javabin&version=2
> Remote error message: Document not found for update. id=ff2dcbee-9c05-491f-91a8-9f1fec348546_invalid
>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:394)
>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:191)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0{code}
>
> {color:#de350b}The 3rd update obviously caused an exception. But why didn't the 4th and
> 5th updates succeed? Is it possible that this exception put the solr
> client or server into some unusable state so that all subsequent updates
> failed?{color}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)