[ 
https://issues.apache.org/jira/browse/SOLR-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1004.
-----------------------------------------

    Resolution: Fixed

Committed revision 745742.

Thanks Marc!

> Optimizing the abort command in delta import
> --------------------------------------------
>
>                 Key: SOLR-1004
>                 URL: https://issues.apache.org/jira/browse/SOLR-1004
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>         Environment: Java - Lucene - Solr - DataImportHandler
>            Reporter: Marc Sturlese
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-1004.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> I have seen that when the abort command is called during a delta import, 
> the doDelta function in DocBuilder.java only checks for abortion at the 
> beginning of collectDelta, after that function returns, and at the end of 
> collectDelta.
> The problem I have found is that if there is a large number of documents to 
> modify and abort is called in the middle of delta collection, it will not 
> take effect until all the data has been collected.
> The same happens when we start deleting or updating documents. In the 
> updating case, there is an abortion check inside buildDocument but, as it is 
> called inside a "while" loop over all docs to update, the loop keeps going 
> through all the remaining docs and skipping them.
> I propose doing an abortion check inside every data collection loop and 
> after calling buildDocument in the doDelta function.
> In the case of modifying documents, the code in DocBuilder.java would look 
> like:
>     while (pkIter.hasNext()) {
>       Map<String, Object> map = pkIter.next();
>       vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map);
>       buildDocument(vri, null, map, root, true, null);
>       pkIter.remove();
>       // check for abortion
>       if (stop.get()) {
>         allPks = null;
>         pkIter = null;
>         return;
>       }
>     }
> In the case of document deletion (the deleteAll function in DocBuilder), 
> just add   if (stop.get()) { break; }   at the end of every loop iteration, 
> and add this check just after deleteAll is called (in doDelta):
>     if (stop.get()) {
>       allPks = null;
>       deletedKeys = null;
>       return;
>     }
> Finally, in collectDelta:
>     while (true) {
>       // check for abortion
>       if (stop.get()) { return myModifiedPks; }
>       Map<String, Object> row = entityProcessor.nextModifiedRowKey();
>       if (row == null)
>         break;
>       ...
> And the same for delete-query collection and parent-delta-query collection.
> I didn't attach the patch because this is the first time I have opened an 
> issue and I don't know if you want to code it as I do. I just wanted to 
> explain the idea and how I solved it; I think it can be useful for other 
> users.
>  
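
The per-iteration abort check proposed above can be sketched in isolation. 
This is a minimal illustration, not the committed patch and not the 
DocBuilder code: the class AbortableLoop, its processAll method and the 
buildDocument callback are hypothetical stand-ins, and only the 
AtomicBoolean "stop" flag mirrors the name used in the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for the loop in doDelta; not Solr code.
final class AbortableLoop {
    // Mirrors the "stop" flag named in the description (assumed to be an AtomicBoolean).
    private final AtomicBoolean stop = new AtomicBoolean(false);

    // Called by whoever handles the abort command.
    void abort() {
        stop.set(true);
    }

    // Processes keys one by one and returns how many were handled.
    // The flag is checked after every document, so an abort takes
    // effect within one iteration instead of after the whole loop.
    int processAll(Iterator<String> keys, Consumer<String> buildDocument) {
        int done = 0;
        while (keys.hasNext()) {
            String key = keys.next();
            buildDocument.accept(key); // stand-in for buildDocument(...)
            keys.remove();
            done++;
            if (stop.get()) {
                return done; // leave early; remaining keys are untouched
            }
        }
        return done;
    }

    public static void main(String[] args) {
        AbortableLoop loop = new AbortableLoop();
        List<String> pks = new ArrayList<>(Arrays.asList("1", "2", "3", "4", "5"));
        // Simulate an abort request arriving while document "2" is being built.
        int done = loop.processAll(pks.iterator(), pk -> {
            if (pk.equals("2")) {
                loop.abort();
            }
        });
        System.out.println(done + " processed, " + pks.size() + " left");
        // prints "2 processed, 3 left"
    }
}
```

Without the check inside the loop, the same run would visit all five keys 
before the abort was noticed, which is the behaviour the issue describes.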

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
