Sorry, couldn't read yesterday... but that's exact what I was suggesting, thank you very much!
JIRA j...@apache.org wrote: > > > [ > https://issues.apache.org/jira/browse/SOLR-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel > ] > > Shalin Shekhar Mangar resolved SOLR-1004. > ----------------------------------------- > > Resolution: Fixed > > Committed revision 745742. > > Thanks Marc! > >> Optimizing the abort command in delta import >> -------------------------------------------- >> >> Key: SOLR-1004 >> URL: https://issues.apache.org/jira/browse/SOLR-1004 >> Project: Solr >> Issue Type: Improvement >> Components: contrib - DataImportHandler >> Affects Versions: 1.3 >> Environment: Java - Lucene - Solr - DataImportHandler >> Reporter: Marc Sturlese >> Assignee: Shalin Shekhar Mangar >> Priority: Minor >> Fix For: 1.4 >> >> Attachments: SOLR-1004.patch >> >> Original Estimate: 0.5h >> Remaining Estimate: 0.5h >> >> I have seen that when abort command is called in a deltaImport, in >> DocBuilder.java, at doDelta functions it's just checked for abortion at >> the begining of collectDelta, after that function and at the end of >> collectDelta. >> The problem I have found is that if there is a big number of documents to >> modify and abort is called in the middle of delta collection, it will not >> take effect until all data is collected. >> Same happens when we start deleteting or updating documents. In updating >> case, there is an abortion check inside buildDocument but, as it is >> called inside a "while" for all docs to update, it will keep going throw >> all docs of the bucle and skipping them. >> I propose to do an abortion check inside every loop of data collection >> and after calling build document in doDelta function. >> In the case of modifing documents, the code in DocBuilder.java would look >> like: >> while (pkIter.hasNext()) { >> Map<String, Object> map = pkIter.next(); >> vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map); >> buildDocument(vri, null, map, root, true, null); >> pkIter.remove(); >> //check if abortion >> if (stop.get()) >> { >> allPks = null ; >> pkIter = null ; >> return; >> } >> } >> In the case of document deletion (deleteAll function in DocBuilder): Just >> >> if (stop.get()){ break ; } at the end of every loop and call this >> just after deleteAll is called (in doDelta) >> if (stop.get()) >> { >> allPks = null; >> deletedKeys = null; >> return; >> } >> Finally in collect delta: >> while (true) { >> //check for abortion >> if (stop.get()){ return myModifiedPks; } >> Map<String, Object> row = entityProcessor.nextModifiedRowKey(); >> if (row == null) >> break; >> ... >> And the same for delete-query collection and parent-delta-query >> collection >> I didn't atach de patch because is the first time I open an issue and >> don't know if you want to code it as I do. Just wanted to explain the >> idea and how I solved, I think it can be useful for other users. >> > > -- > This message is automatically generated by JIRA. > - > You can reply to this email to add a comment to the issue online. > > > -- View this message in context: http://www.nabble.com/-jira--Created%3A-%28SOLR-1004%29-Optimizing-the-abort-command-in-delta-import-tp21808783p22096080.html Sent from the Solr - Dev mailing list archive at Nabble.com.