[ 
https://issues.apache.org/jira/browse/SOLR-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703216#action_12703216
 ] 

Shalin Shekhar Mangar commented on SOLR-1120:
---------------------------------------------

Committed revision 769058.

Thanks Noble!

> Simplify EntityProcessor API
> ----------------------------
>
>                 Key: SOLR-1120
>                 URL: https://issues.apache.org/jira/browse/SOLR-1120
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.4
>
>         Attachments: SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch, 
> SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch
>
>
> Writing an EntityProcessor is deceptively complex. There are so many gotchas.
> I propose the following:
> # Extract the Transformer application logic from EntityProcessor and add 
> it to DocBuilder. Then an EntityProcessor does not need to call applyTransformer 
> or know about the rowIterator and getFromRowCache() methods.
> # Change the meaning of EntityProcessor#destroy so that it is called at the 
> end of each parent row. Right now init is called once per parent row but 
> destroy actually means the end of import. In fact, there is no correct way 
> for an entity processor to clean up right now. Most clean up when returning 
> null (end of data), but with the introduction of $skipDoc, a transformer can 
> return $skipDoc and the entity processor will never get a chance to clean up 
> for the current init.
> # EntityProcessor will use the EventListener API to listen for import end. 
> This should be used by EntityProcessor to do a final cleanup.
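
The three proposals in the description can be sketched roughly as follows. This is only an illustration of the proposed lifecycle, not the committed patch: SketchEntityProcessor, SketchTransformer, SketchDocBuilder, CountingProcessor and onImportEnd are hypothetical names invented for this example, and treating $skipDoc as "stop reading the current entity" is a deliberate simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed contract; NOT the real Solr interface.
interface SketchEntityProcessor {
    void init(String parentRow);      // called once per parent row
    Map<String, Object> nextRow();    // null means end of data
    void destroy();                   // proposed: called at the end of every parent row
    void onImportEnd();               // proposed: final cleanup when the import ends
}

// Hypothetical transformer; it may mark a row with $skipDoc.
interface SketchTransformer {
    Map<String, Object> transformRow(Map<String, Object> row);
}

// Toy processor: emits rows 1..3 for each parent and records lifecycle calls.
class CountingProcessor implements SketchEntityProcessor {
    final List<String> log = new ArrayList<>();
    private String parent;
    private int next;

    public void init(String parentRow) {
        parent = parentRow;
        next = 1;
        log.add("init " + parentRow);
    }

    public Map<String, Object> nextRow() {
        if (next > 3) return null;
        Map<String, Object> row = new HashMap<>();
        row.put("id", parent + "-" + next++);
        return row;
    }

    public void destroy() { log.add("destroy " + parent); }

    public void onImportEnd() { log.add("importEnd"); }
}

// Hypothetical driver playing the DocBuilder role: it applies transformers
// itself, so the processor never needs to know about them.
class SketchDocBuilder {
    static List<Map<String, Object>> run(List<String> parents,
                                         SketchEntityProcessor p,
                                         SketchTransformer t) {
        List<Map<String, Object>> docs = new ArrayList<>();
        try {
            for (String parent : parents) {
                p.init(parent);
                try {
                    Map<String, Object> row;
                    while ((row = p.nextRow()) != null) {
                        row = t.transformRow(row);
                        // Simplification: a $skipDoc row ends this entity early,
                        // so the processor never reaches "end of data".
                        if ("true".equals(String.valueOf(row.get("$skipDoc")))) break;
                        docs.add(row);
                    }
                } finally {
                    p.destroy();   // runs even when the loop exits early
                }
            }
        } finally {
            p.onImportEnd();       // one final cleanup for the whole import
        }
        return docs;
    }
}

class LifecycleSketch {
    // Transformer that flags every second row of a parent with $skipDoc.
    static SketchTransformer skipSecond() {
        return row -> {
            if (row.get("id").toString().endsWith("-2")) row.put("$skipDoc", "true");
            return row;
        };
    }

    public static void main(String[] args) {
        CountingProcessor p = new CountingProcessor();
        List<Map<String, Object>> docs =
            SketchDocBuilder.run(Arrays.asList("A", "B"), p, skipSecond());
        System.out.println(docs);
        System.out.println(p.log);
    }
}
```

In this sketch the processor still sees a destroy call for each init even though the transformer cuts both parents short, and the recorded call order is init A, destroy A, init B, destroy B, importEnd.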

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
