I want to modify Nutch to increase the score of some pages in their
CrawlDatum. The goal is to mark the pages that contain a certain token:
raising their score to a high value should make them more likely to be
chosen again in the next segment generation.
I modified it like this:
Fetcher.j
[
https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-798.
-
Resolution: Fixed
Updated SOLRJ's dependencies at the same time:
Deleting lib/apache-solr
[
https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-801.
-
Resolution: Fixed
Committed revision 921840.
> Remove RTF and MP3 parse plugins
> --
Hi everyone
I've been using Nutch for a while now and I've hit a snag.
I'm trying to find where newly linked pages are added to the segment as a
specific entry.
To be clear, I've been through the fetch class and the crawlDBFilter
and reducer.
But I'm looking for the initial entry w
Please
keep me out of this group.
Thanks
___
Jesiel A.S. Trevisan
Email: jesieltrevi...@gmail.com.br
MSN: jesieltrevi...@hotmail.com
Skype & AIM: jesieltrevisan
Yahoo Messenger: jesiel.trevisan
ICQ: 46527510
[
https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844352#action_12844352
]
Hudson commented on NUTCH-801:
--
Integrated in Nutch-trunk #1093 (See
[http://hudson.zones.apac
[
https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844351#action_12844351
]
Hudson commented on NUTCH-798:
--
Integrated in Nutch-trunk #1093 (See
[http://hudson.zones.apac