[ https://issues.apache.org/jira/browse/SOLR-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703216#action_12703216 ]
Shalin Shekhar Mangar commented on SOLR-1120:
---------------------------------------------

Committed revision 769058.

Thanks Noble!

> Simplify EntityProcessor API
> ----------------------------
>
>                 Key: SOLR-1120
>                 URL: https://issues.apache.org/jira/browse/SOLR-1120
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.4
>
>         Attachments: SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch,
> SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch, SOLR-1120.patch
>
>
> Writing an EntityProcessor is deceptively complex. There are so many gotchas.
> I propose the following:
> # Extract the Transformer application logic from EntityProcessor and add
> it to DocBuilder. Then an EntityProcessor does not need to call
> applyTransformer or know about the rowIterator and getFromRowCache() methods.
> # Change the meaning of EntityProcessor#destroy so that it is called at the
> end of the parent's row -- Right now init is called once per parent row, but
> destroy actually means the end of the import. In fact, there is no correct way
> for an entity processor to clean up right now. Most clean up when returning
> null (end of data), but with the introduction of $skipDoc, a transformer can
> return $skipDoc and the entity processor will never get a chance to clean up
> for the current init.
> # EntityProcessor will use the EventListener API to listen for the end of the
> import. This should be used by EntityProcessor to do a final cleanup.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
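The lifecycle proposed in the issue description can be illustrated with a small, self-contained sketch. It does not use the real DataImportHandler classes: `SketchEntityProcessor`, `LoggingProcessor`, `transform`, `runImport`, and `onImportEnd` are hypothetical names invented here, and `onImportEnd` merely stands in for an import-end event listener callback. The point it tries to show is that when the driver (playing DocBuilder's role) applies the transformer and wraps each parent row in try/finally, `destroy` runs once per `init` even when the transformer returns `$skipDoc`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class EntityLifecycleSketch {

    static final String SKIP_DOC = "$skipDoc";

    /** Hypothetical, simplified contract; NOT the real Solr EntityProcessor API. */
    interface SketchEntityProcessor {
        void init(String parentRow); // called once per parent row
        String nextRow();            // returns null at end of data
        void destroy();              // proposed meaning: end of the current parent row
        void onImportEnd();          // stand-in for an import-end event callback
    }

    /** Produces two child rows per parent row and records every lifecycle call. */
    static class LoggingProcessor implements SketchEntityProcessor {
        private final List<String> log;
        private String parent;
        private Iterator<String> rows;

        LoggingProcessor(List<String> log) {
            this.log = log;
        }

        public void init(String parentRow) {
            parent = parentRow;
            rows = Arrays.asList(parentRow + "-1", parentRow + "-2").iterator();
            log.add("init " + parent);
        }

        public String nextRow() {
            return rows.hasNext() ? rows.next() : null;
        }

        public void destroy() {
            log.add("destroy " + parent);
        }

        public void onImportEnd() {
            log.add("importEnd");
        }
    }

    /** Hypothetical transformer: skips rows starting with "bad", upper-cases the rest. */
    static String transform(String row) {
        return row.startsWith("bad") ? SKIP_DOC : row.toUpperCase();
    }

    /** The driver applies the transformer, so the processor never has to. */
    static List<String> runImport(List<String> parentRows) {
        List<String> log = new ArrayList<>();
        SketchEntityProcessor processor = new LoggingProcessor(log);
        try {
            for (String parent : parentRows) {
                processor.init(parent);
                try {
                    String row;
                    while ((row = processor.nextRow()) != null) {
                        String out = transform(row);
                        if (SKIP_DOC.equals(out)) {
                            log.add("skip " + parent);
                            break; // abandon this parent row's document
                        }
                        log.add("emit " + out);
                    }
                } finally {
                    processor.destroy(); // always paired with the init above
                }
            }
        } finally {
            processor.onImportEnd(); // final cleanup, once per import
        }
        return log;
    }

    public static void main(String[] args) {
        for (String entry : runImport(Arrays.asList("a", "bad"))) {
            System.out.println(entry);
        }
    }
}
```

In this sketch the processor only produces rows; it never sees the transformer or `$skipDoc`, which is the separation the first proposal item asks for. How the committed patch actually wires these calls is not shown in this message.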