Re: [jira] Resolved: (SOLR-1004) Optimizing the abort command in delta import

Marc Sturlese Thu, 19 Feb 2009 00:41:49 -0800

Sorry, I couldn't read this yesterday... but that's exactly what I was
suggesting, thank you very much!


JIRA j...@apache.org wrote:
> 
> 
>      [
> https://issues.apache.org/jira/browse/SOLR-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
> 
> Shalin Shekhar Mangar resolved SOLR-1004.
> -----------------------------------------
> 
>     Resolution: Fixed
> 
> Committed revision 745742.
> 
> Thanks Marc!
> 
>> Optimizing the abort command in delta import
>> --------------------------------------------
>>
>>                 Key: SOLR-1004
>>                 URL: https://issues.apache.org/jira/browse/SOLR-1004
>>             Project: Solr
>>          Issue Type: Improvement
>>          Components: contrib - DataImportHandler
>>    Affects Versions: 1.3
>>         Environment: Java - Lucene - Solr - DataImportHandler
>>            Reporter: Marc Sturlese
>>            Assignee: Shalin Shekhar Mangar
>>            Priority: Minor
>>             Fix For: 1.4
>>
>>         Attachments: SOLR-1004.patch
>>
>>   Original Estimate: 0.5h
>>  Remaining Estimate: 0.5h
>>
>> I have seen that when the abort command is called during a deltaImport,
>> the doDelta function in DocBuilder.java only checks for abortion at the
>> beginning of collectDelta, after that function, and at the end of
>> collectDelta.
>> The problem I have found is that if there is a large number of documents
>> to modify and abort is called in the middle of delta collection, it will
>> not take effect until all the data is collected.
>> The same happens when we start deleting or updating documents. In the
>> updating case, there is an abortion check inside buildDocument but, as it
>> is called inside a "while" loop over all docs to update, it will keep
>> going through all the docs in the loop and skipping them.
>> I propose doing an abortion check inside every data collection loop and
>> after calling buildDocument in the doDelta function.
>> In the case of modifying documents, the code in DocBuilder.java would look
>> like:
>>     while (pkIter.hasNext()) {
>>       Map<String, Object> map = pkIter.next();
>>       vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map);
>>       buildDocument(vri, null, map, root, true, null);
>>       pkIter.remove();
>>       // check for abortion
>>       if (stop.get()) {
>>         allPks = null;
>>         pkIter = null;
>>         return;
>>       }
>>     }
>> In the case of document deletion (the deleteAll function in DocBuilder),
>> just add
>>       if (stop.get()) { break; }
>> at the end of every loop, and add this just after deleteAll is called (in
>> doDelta):
>>       if (stop.get()) {
>>         allPks = null;
>>         deletedKeys = null;
>>         return;
>>       }
>> Finally, in collectDelta:
>>       while (true) {
>>         // check for abortion
>>         if (stop.get()) { return myModifiedPks; }
>>         Map<String, Object> row = entityProcessor.nextModifiedRowKey();
>>         if (row == null)
>>           break;
>>         ...
>> And the same for delete-query collection and parent-delta-query
>> collection.
>> I didn't attach the patch because this is the first time I have opened an
>> issue and I don't know if you want to code it as I do. I just wanted to
>> explain the idea and how I solved it; I think it can be useful for other
>> users.
>>  
> 
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 
> 
> 
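
For anyone reading this thread later, the idea in the quoted issue can be
sketched outside Solr as below. This is only a minimal illustration of
checking a shared stop flag inside each loop, not the committed patch and
not DocBuilder's real API: the class AbortableDelta, its collect and update
methods, and the buildDocument callback are all hypothetical names, and the
stop flag is assumed to be an AtomicBoolean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for the delta-import loops; not Solr code.
final class AbortableDelta {
    // Assumed shared flag, set when the abort command arrives.
    private final AtomicBoolean stop = new AtomicBoolean(false);

    void abort() {
        stop.set(true);
    }

    // Collection loop: check the flag before fetching each row, so an abort
    // takes effect without waiting for all rows to be collected.
    List<String> collect(Iterator<String> rows) {
        List<String> keys = new ArrayList<>();
        while (rows.hasNext()) {
            if (stop.get()) {
                return keys; // hand back whatever was collected so far
            }
            keys.add(rows.next());
        }
        return keys;
    }

    // Update loop: check the flag after each document and leave the loop,
    // instead of iterating over every remaining key and skipping it.
    // Returns true only if all keys were processed.
    boolean update(List<String> keys, Consumer<String> buildDocument) {
        Iterator<String> it = keys.iterator();
        while (it.hasNext()) {
            String key = it.next();
            buildDocument.accept(key);
            it.remove();
            if (stop.get()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AbortableDelta delta = new AbortableDelta();
        List<String> keys = new ArrayList<>(Arrays.asList("a", "b", "c"));
        // Simulate an abort arriving while document "b" is being built.
        boolean finished = delta.update(keys, key -> {
            if (key.equals("b")) {
                delta.abort();
            }
        });
        System.out.println("finished=" + finished + ", remaining=" + keys);
    }
}
```

The update loop checks the flag after each document rather than before, to
match the placement proposed in the issue; checking before each document
would work equally well for stopping early.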

-- 
View this message in context: 
http://www.nabble.com/-jira--Created%3A-%28SOLR-1004%29-Optimizing-the-abort-command-in-delta-import-tp21808783p22096080.html
Sent from the Solr - Dev mailing list archive at Nabble.com.
