[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters

2006-12-27 Thread Alan Tanaman (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-421?page=all ]

Alan Tanaman updated NUTCH-421:
---

Description: 
I've tested a patch for org.apache.nutch.indexer.IndexingFilters, allowing the 
user to state in which order the indexing filters are to be run based on a new
indexingfilter.order property. This is needed when a filter needs to rely on 
previously generated document fields as a source of input to generate further 
fields.
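
To illustrate why running order matters when one filter consumes a field produced by another, here is a minimal Java sketch. It is not taken from the patch and does not use the Nutch API: the `Filter` interface, the two filters, and the field names `title` and `titleLength` are all hypothetical stand-ins, and a document is modelled as a plain map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FilterDependencySketch {
    /** Hypothetical stand-in for an indexing filter: reads and adds document fields. */
    interface Filter {
        void apply(Map<String, String> doc);
    }

    /** Adds a "title" field (plays the role of an earlier filter). */
    static final Filter TITLE = doc -> doc.put("title", "Hello Nutch");

    /** Derives "titleLength" from "title"; it can only do so if "title" already exists. */
    static final Filter TITLE_LENGTH = doc -> {
        String title = doc.get("title");
        if (title != null) {
            doc.put("titleLength", Integer.toString(title.length()));
        }
    };

    /** Runs the filters over a fresh, empty document in the given order. */
    static Map<String, String> run(List<Filter> filters) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (Filter f : filters) {
            f.apply(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Producer first: the derived field is generated.
        System.out.println(run(List.of(TITLE, TITLE_LENGTH)));
        // Consumer first: the derived field is missing.
        System.out.println(run(List.of(TITLE_LENGTH, TITLE)));
    }
}
```

In this sketch the dependent filter silently produces nothing when it runs too early, which is the situation a configurable order is meant to avoid.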

As suggested elsewhere, I based this on the urlfilter.order functionality:

<property>
  <name>indexingfilter.order</name>
  <value>org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter</value>
  <description>The order by which index filters are applied.
  If empty, all available index filters (as dictated by properties
  plugin-includes and plugin-excludes above) are loaded and applied in system
  defined order. If not empty, only named filters are loaded and applied
  in given order. For example, if this property has value:
  org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter
  then BasicIndexingFilter is applied first, and MoreIndexingFilter second.
  Since all filters are AND'ed, filter ordering does not have impact
  on end result, but it may have performance implication, depending
  on relative expensiveness of filters.
  </description>
</property>

Patch will be attached to this issue by 29/12/06
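
The semantics described for indexingfilter.order (empty means all filters in system-defined order, non-empty means only the named filters in the given order) could be sketched as follows. This is an illustration under stated assumptions, not the contents of the attached patch: the class `FilterOrderSketch` and the method `resolveOrder` are hypothetical, filters are represented only by their class names, and skipping unknown or repeated names is an assumption the description does not cover.

```java
import java.util.ArrayList;
import java.util.List;

class FilterOrderSketch {
    /**
     * Returns the filter class names to run, in running order.
     *
     * @param available     names of all loaded filters, in system-defined order
     * @param orderProperty whitespace-separated class names, or null/blank if unset
     */
    static List<String> resolveOrder(List<String> available, String orderProperty) {
        // Empty property: every available filter, in system-defined order.
        if (orderProperty == null || orderProperty.trim().isEmpty()) {
            return new ArrayList<>(available);
        }
        // Non-empty property: only the named filters, in the order given.
        List<String> ordered = new ArrayList<>();
        for (String name : orderProperty.trim().split("\\s+")) {
            // Assumption: names that are not loaded, or are repeated, are skipped.
            if (available.contains(name) && !ordered.contains(name)) {
                ordered.add(name);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> available = List.of(
            "org.apache.nutch.indexer.more.MoreIndexingFilter",
            "org.apache.nutch.indexer.basic.BasicIndexingFilter");
        String order = "org.apache.nutch.indexer.basic.BasicIndexingFilter "
            + "org.apache.nutch.indexer.more.MoreIndexingFilter";
        System.out.println(resolveOrder(available, order));
        System.out.println(resolveOrder(available, ""));
    }
}
```

Splitting on whitespace rather than a single space means the value can wrap across lines in the configuration file, as it does in the example above.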


 Allow predeterminate running order of index filters
 ---

             Key: NUTCH-421
             URL: http://issues.apache.org/jira/browse/NUTCH-421
         Project: Nutch
      Issue Type: Improvement
      Components: indexer
Affects Versions: 0.8.1
     Environment: All
        Reporter: Alan Tanaman
        Priority: Minor


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters

2006-12-27 Thread Alan Tanaman (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-421?page=all ]

Alan Tanaman updated NUTCH-421:
---

Attachment: nutch-421.patch






[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters

2006-12-27 Thread Alan Tanaman (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-421?page=all ]

Alan Tanaman updated NUTCH-421:
---

Description: 
I've tested a patch for org.apache.nutch.indexer.IndexingFilters, allowing the 
user to state in which order the indexing filters are to be run based on a new
indexingfilter.order property. This is needed when a filter needs to rely on 
previously generated document fields as a source of input to generate further 
fields.

As suggested elsewhere, I based this on the urlfilter.order functionality:

<property>
  <name>indexingfilter.order</name>
  <value>org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter</value>
  <description>The order by which index filters are applied.
  If empty, all available index filters (as dictated by properties
  plugin-includes and plugin-excludes above) are loaded and applied in system
  defined order. If not empty, only named filters are loaded and applied
  in given order. For example, if this property has value:
  org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter
  then BasicIndexingFilter is applied first, and MoreIndexingFilter second.
  Since all filters are AND'ed, filter ordering does not have impact
  on end result, but it may have performance implication, depending
  on relative expensiveness of filters.
  </description>
</property>




