Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change
notification.
The following page has been changed by PiotrBazan:
http://wiki.apache.org/nutch/RunNutchInEclipse0%2e9
--
[ https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Beaucarnea updated NUTCH-386:
-
Attachment: index-url-category.jar
This plugin uses the deprecated org.apache.hadoop.io.UTF8 which caused
Where can I find Scott's solution? I am trying to do it exactly like Scott,
but I cannot imagine how to index items separately.
Please, can anybody help me?
Many thanks
Miro
sdeck wrote:
>
> So, here is what I do for RSS Feeds.
>
> I parse the rss, and for each outlink, I create the outlink
Using nutch 0.9 (hadoop 0.17.1):
[EMAIL PROTECTED] working]$ bin/nutch readlinkdb
/home/hadoop/crawl-20081201/crawldb -dump crawled_urls.txt
LinkDb dump: starting
LinkDb db: /home/hadoop/crawl-urls-20081201/crawldb
java.io.IOException: Type mismatch in value from map: expected
org.apache.nutch.cra
On Wed, Dec 3, 2008 at 8:55 PM, brainstorm <[EMAIL PROTECTED]> wrote:
> Using nutch 0.9 (hadoop 0.17.1):
>
> [EMAIL PROTECTED] working]$ bin/nutch readlinkdb
> /home/hadoop/crawl-20081201/crawldb -dump crawled_urls.txt
> LinkDb dump: starting
> LinkDb db: /home/hadoop/crawl-urls-20081201/crawldb
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/650/changes
--
[...truncated 2152 lines...]
A src/plugin/protocol-http/src/test/org/apache/nutch
A src/plugin/protocol-http/src/test/org/apache/nutch/protocol
A src/plugin/prot