[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: SegmentMergeFilter.java
Added Apache License.
Segment merge filtering based
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: SegmentMergeFilters.java
Added Apache license header.
Segment merge
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763681#action_12763681
]
Marcin Okraszewski commented on NUTCH-677:
--
Sorry, I didn't notice the request for
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-740:
-
Attachment: AcceptLanguage_trunk_2009-06-09.patch
It does apply, but with Fuzz factor set
Configuration option to override default language for fetched pages.
Key: NUTCH-740
URL: https://issues.apache.org/jira/browse/NUTCH-740
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-740:
-
Attachment: AcceptLanguage.patch
The patch which allows overriding of Accept-Language
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: MergeFilter_for_1.0.patch
The patch ported to Nutch 1.0. The Java files
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-490:
-
Attachment: NekoFilters_for_1.0.patch
Patch ported to Nutch 1.0. It includes the two
Segment merge filtering based on segment content
---
Key: NUTCH-677
URL: https://issues.apache.org/jira/browse/NUTCH-677
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: MergeFilter.patch
The patch for 0.9
Segment merge filtering based on segment
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: SegmentMergeFilter.java
The filter interface (referred to by the patch).
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-677:
-
Attachment: SegmentMergeFilters.java
Merge filter aggregation which hides extension
[
https://issues.apache.org/jira/browse/NUTCH-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-488:
-
Attachment: ignore_tags_v3.patch
OK, yet another approach based on Doğacan's comments.
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-490:
-
Attachment: HtmlParser.java.diff
Patch for HtmlParser.
Extension point with filters for
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-490:
-
Attachment: nutch-extensionpoins_plugin.xml.diff
Patch for plugin.xml in
Neko HTML parser goes on default settings.
--
Key: NUTCH-487
URL: https://issues.apache.org/jira/browse/NUTCH-487
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: