Allow predeterminate running order of index filters
---
Key: NUTCH-421
URL: http://issues.apache.org/jira/browse/NUTCH-421
Project: Nutch
Issue Type: Improvement
Components: indexer
[ http://issues.apache.org/jira/browse/NUTCH-421?page=all ]
Alan Tanaman updated NUTCH-421:
---
Description:
I've tested a patch for org.apache.nutch.indexer.IndexingFilters, allowing the
user to state in which order the indexing filters are to be run based
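
For illustration only (this is not the attached patch): a minimal Java sketch of running filters in a user-stated order. It assumes the order is given as a whitespace-separated list of filter class names. The Filter interface, the orderFilters and runAll helpers, and the example.* names are all hypothetical and are not taken from Nutch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilterOrderSketch {

    /** Hypothetical stand-in for an indexing filter; not the Nutch interface. */
    interface Filter {
        String apply(String doc);
    }

    /**
     * Returns the filters named in orderSpec (whitespace-separated names),
     * in that order. If orderSpec is null or blank, all filters are returned
     * in their discovery order. Unknown names are skipped.
     */
    static List<Filter> orderFilters(Map<String, Filter> available, String orderSpec) {
        if (orderSpec == null || orderSpec.trim().isEmpty()) {
            return new ArrayList<>(available.values());
        }
        List<Filter> ordered = new ArrayList<>();
        for (String name : orderSpec.trim().split("\\s+")) {
            Filter f = available.get(name);
            if (f != null) {
                ordered.add(f);
            }
        }
        return ordered;
    }

    /** Runs the filters in sequence, feeding each one the previous result. */
    static String runAll(List<Filter> filters, String doc) {
        for (Filter f : filters) {
            doc = f.apply(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, standing in for discovery order.
        Map<String, Filter> available = new LinkedHashMap<>();
        available.put("example.AppendA", doc -> doc + "A");
        available.put("example.AppendB", doc -> doc + "B");

        // Explicit order: B runs before A.
        System.out.println(runAll(orderFilters(available, "example.AppendB example.AppendA"), "doc:"));
        // No order stated: fall back to discovery order.
        System.out.println(runAll(orderFilters(available, ""), "doc:"));
    }
}
```

In this sketch an empty order string falls back to whatever order the filters were discovered in, so an existing configuration would behave as before.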
[ http://issues.apache.org/jira/browse/NUTCH-421?page=all ]
Alan Tanaman updated NUTCH-421:
---
Attachment: nutch-421.patch
[ http://issues.apache.org/jira/browse/NUTCH-415?page=all ]
Andrzej Bialecki closed NUTCH-415.
---
Fix Version/s: (was: 0.8.2)
Resolution: Fixed
Fixed in trunk, rev. 490607. Locking has been added, but it's still possible
to force
[ http://issues.apache.org/jira/browse/NUTCH-416?page=all ]
Andrzej Bialecki closed NUTCH-416.
---
Resolution: Fixed
Fixed in trunk, rev. 490607. As a side effect it is now possible to correctly
update CrawlDB from multiple segments, even if they
[ http://issues.apache.org/jira/browse/NUTCH-322?page=all ]
Andrzej Bialecki closed NUTCH-322.
---
Resolution: Fixed
Fixed in trunk/, rev. 490607. NOTE: this doesn't solve the whole issue of
proper handling of redirected pages from the point of view
[ http://issues.apache.org/jira/browse/NUTCH-273?page=all ]
Andrzej Bialecki closed NUTCH-273.
---
Fix Version/s: 0.9.0
Resolution: Fixed
Assignee: Andrzej Bialecki
Fixed in trunk/, rev. 490607.
When a page is redirected, the
[ http://issues.apache.org/jira/browse/NUTCH-274?page=all ]
Andrzej Bialecki closed NUTCH-274.
---
Fix Version/s: 0.8.2, 0.9.0
Resolution: Fixed
Assignee: Andrzej Bialecki
This bug has been fixed in recent versions of