[ https://issues.apache.org/jira/browse/SOLR-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700087#action_12700087 ]
Noble Paul commented on SOLR-1120:
----------------------------------

bq. Is there any legitimate place where one would want to disallow replaceTokens?

Yes. The XPathEntityProcessor uses it directly, just to find out which variables appear in the url so that it can read and store them. We could probably add a method getEntityAttributeResolved() to get the resolved value.

> Simplify EntityProcessor API
> ----------------------------
>
>                 Key: SOLR-1120
>                 URL: https://issues.apache.org/jira/browse/SOLR-1120
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.4
>
>
> Writing an EntityProcessor is deceptively complex. There are so many gotchas.
> I propose the following:
> # Extract the Transformer application logic from EntityProcessor and add
> it to DocBuilder. Then EntityProcessor does not need to call applyTransformer
> or know about the rowIterator and getFromRowCache() methods.
> # Change the meaning of EntityProcessor#destroy so that it is called at the end
> of the parent's row -- Right now init is called once per parent row but destroy
> actually means the end of import. In fact, there is no correct way for an
> entity processor to do clean up right now. Most do clean up when returning
> null (end of data) but with the introduction of $skipDoc, a transformer can
> return $skipDoc and the entity processor will never get a chance to clean up
> for the current init.
> # EntityProcessor will use the EventListener API to listen for import end.
> This should be used by EntityProcessor to do a final cleanup.
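
To make the lifecycle proposed in the issue description concrete, here is a rough sketch of how it might look. This is not DataImportHandler code: the Processor interface, RecordingProcessor, runImport and the onImportEnd callback are all hypothetical, simplified stand-ins, and the skipRow argument only mimics a transformer returning $skipDoc. The point is that destroy runs once per parent row even when the row is abandoned before end of data, and that final cleanup happens on a separate import-end notification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LifecycleSketch {
    /** Hypothetical, simplified stand-in for the proposed contract (not the real API). */
    interface Processor {
        void init(String parentRow); // called once per parent row
        String nextRow();            // null signals end of data
        void destroy();              // proposed: called at the end of each parent row
        void onImportEnd();          // proposed: final cleanup on an import-end event
    }

    /** Records lifecycle calls so their order can be inspected. */
    static class RecordingProcessor implements Processor {
        final List<String> calls = new ArrayList<>();
        private Iterator<String> rows;

        public void init(String parentRow) {
            calls.add("init:" + parentRow);
            rows = Arrays.asList(parentRow + "-a", parentRow + "-b").iterator();
        }

        public String nextRow() {
            return rows.hasNext() ? rows.next() : null;
        }

        public void destroy() {
            calls.add("destroy");
        }

        public void onImportEnd() {
            calls.add("importEnd");
        }
    }

    /** Plays the role of the driver; skipRow mimics a transformer returning $skipDoc. */
    static void runImport(List<String> parentRows, Processor p, String skipRow) {
        for (String parent : parentRows) {
            p.init(parent);
            try {
                String row;
                while ((row = p.nextRow()) != null) {
                    if (row.equals(skipRow)) {
                        break; // abandon this parent row before end of data
                    }
                }
            } finally {
                p.destroy(); // still runs when the row was skipped early
            }
        }
        p.onImportEnd(); // one-time cleanup after the whole import
    }

    public static void main(String[] args) {
        RecordingProcessor p = new RecordingProcessor();
        runImport(Arrays.asList("p1", "p2"), p, "p1-a");
        System.out.println(p.calls);
        // prints [init:p1, destroy, init:p2, destroy, importEnd]
    }
}
```

Putting destroy in a finally block in the driver is one way to guarantee the per-row cleanup; the processor itself then no longer has to detect end of data in order to release resources.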