Segment merge filtering based on segment content
------------------------------------------------

Key: NUTCH-677
URL: https://issues.apache.org/jira/browse/NUTCH-677
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
Reporter: Marcin Okraszewski
Fix For: 0.9.0

I needed segment filtering based on metadata detected during the parse phase. Unfortunately, the current URL-based filtering does not allow this. So I have created a new SegmentMergeFilter extension, which receives the segment entry being merged and decides whether it should be included. Although I needed only ParseData for my purpose, I made it more general, so the filter receives all of the merged data.

The attached patch is for version 0.9, which I use. Unfortunately, I didn't have time to check how it fits the trunk version. Sorry :(
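
To illustrate the idea, here is a minimal, self-contained sketch of a content-based merge filter. It is not the attached patch and does not use the Nutch API: ParseInfo, SegmentEntry, MergeFilter, MetadataFilter and merge are hypothetical stand-ins, and the real extension point may have a different signature. It only shows the decision described above, where each entry being merged is passed to the filters and is kept only if none of them rejects it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class SegmentMergeFilterSketch {

    /** Hypothetical stand-in for parse-phase data; NOT the real Nutch ParseData. */
    static final class ParseInfo {
        final Map<String, String> metadata;

        ParseInfo(Map<String, String> metadata) {
            this.metadata = metadata;
        }
    }

    /** Hypothetical stand-in for one segment entry that is being merged. */
    static final class SegmentEntry {
        final String url;
        final ParseInfo parseInfo; // null when the entry was never parsed

        SegmentEntry(String url, ParseInfo parseInfo) {
            this.url = url;
            this.parseInfo = parseInfo;
        }
    }

    /** Hypothetical filter contract: return true to keep the entry. */
    interface MergeFilter {
        boolean filter(SegmentEntry entry);
    }

    /** Keeps only entries whose parse metadata contains key=value. */
    static final class MetadataFilter implements MergeFilter {
        private final String key;
        private final String value;

        MetadataFilter(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public boolean filter(SegmentEntry entry) {
            if (entry.parseInfo == null) {
                return false; // nothing to inspect, so drop the entry
            }
            return value.equals(entry.parseInfo.metadata.get(key));
        }
    }

    /** Copies entries to the merged output unless some filter rejects them. */
    static List<SegmentEntry> merge(List<SegmentEntry> entries, List<MergeFilter> filters) {
        List<SegmentEntry> merged = new ArrayList<>();
        for (SegmentEntry entry : entries) {
            boolean keep = true;
            for (MergeFilter f : filters) {
                if (!f.filter(entry)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                merged.add(entry);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<SegmentEntry> entries = Arrays.asList(
            new SegmentEntry("http://example.com/a",
                new ParseInfo(Collections.singletonMap("lang", "en"))),
            new SegmentEntry("http://example.com/b",
                new ParseInfo(Collections.singletonMap("lang", "de"))),
            new SegmentEntry("http://example.com/c", null));

        List<MergeFilter> filters =
            Collections.<MergeFilter>singletonList(new MetadataFilter("lang", "en"));

        for (SegmentEntry kept : merge(entries, filters)) {
            System.out.println(kept.url);
        }
    }
}
```

A URL-based filter could not make this decision, because the "lang" value in the sketch only exists after parsing. Passing the whole entry to the filter, rather than only the parse data, reflects the more general design mentioned above.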