Allow predeterminate running order of index filters
---------------------------------------------------
                 Key: NUTCH-421
                 URL: http://issues.apache.org/jira/browse/NUTCH-421
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 0.8.1
         Environment: All
            Reporter: Alan Tanaman
            Priority: Minor

I've tested a patch for org.apache.nutch.indexer.IndexingFilters that lets the user specify the order in which the indexing filters run, through a new indexingfilter.order property. This is needed when one filter relies on document fields generated by an earlier filter as input for generating further fields.

As suggested elsewhere, I based this on the urlfilter.order functionality:

<property>
  <name>indexingfilter.order</name>
  <value>org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter</value>
  <description>The order by which index filters are applied.
  If empty, all available index filters (as dictated by properties
  plugin-includes and plugin-excludes above) are loaded and applied in
  system defined order. If not empty, only named filters are loaded and
  applied in given order. For example, if this property has value:
  org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter
  then BasicIndexingFilter is applied first, and MoreIndexingFilter second.
  Filter ordering can affect the end result when one filter depends on
  fields generated by an earlier filter.
  </description>
</property>
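To illustrate the behaviour the description asks for, here is a minimal sketch in plain Java. It is not the attached patch and does not use the Nutch plugin API: the DocFilter interface, the FilterOrderSketch class, and both method names are hypothetical stand-ins. It assumes the property value is a whitespace-separated list of class names, that an empty value means "apply every available filter in whatever order the registry supplies", and that names with no matching filter are silently skipped (the issue does not say how the real patch handles that case).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an indexing filter; NOT the Nutch interface. */
interface DocFilter {
    /** Adds or rewrites fields in the document and returns it. */
    Map<String, String> filter(Map<String, String> doc);
}

final class FilterOrderSketch {

    private FilterOrderSketch() {}

    /**
     * Resolves the filters to run from an order string such as the value of
     * indexingfilter.order.
     *
     * @param orderValue whitespace-separated class names; null or blank means
     *                   "no explicit order"
     * @param available  every loaded filter, keyed by class name
     */
    static List<DocFilter> orderFilters(String orderValue,
                                        Map<String, DocFilter> available) {
        List<DocFilter> ordered = new ArrayList<>();
        if (orderValue == null || orderValue.trim().isEmpty()) {
            // No explicit order: use all filters in the registry's own order.
            ordered.addAll(available.values());
            return ordered;
        }
        for (String name : orderValue.trim().split("\\s+")) {
            DocFilter f = available.get(name);
            if (f != null) {
                // Only named filters are applied, in the order given.
                ordered.add(f);
            }
            // Assumption: unknown names are skipped rather than reported.
        }
        return ordered;
    }

    /** Runs the filters one after another, each seeing the previous output. */
    static Map<String, String> run(List<DocFilter> filters,
                                   Map<String, String> doc) {
        for (DocFilter f : filters) {
            doc = f.filter(doc);
        }
        return doc;
    }
}
```

Because run() passes each filter the document produced by the previous one, a filter that derives a field from an earlier filter's output only works when it is listed after that filter, which is the use case behind this issue.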