[Nutch Wiki] Update of "RunNutchInEclipse0.9" by PiotrBazan

2008-12-03 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification. The following page has been changed by PiotrBazan: http://wiki.apache.org/nutch/RunNutchInEclipse0%2e9 -- *

[jira] Updated: (NUTCH-386) Plugin to index categories by url rules

2008-12-03 Thread Beaucarnea (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Beaucarnea updated NUTCH-386:
Attachment: index-url-category.jar
This plugin uses the deprecated org.apache.hadoop.io.UTF8 which caused

Re: RSS-fecter and index individul-how can i realize this function

2008-12-03 Thread mirkes
Where can I find Scott's solution? I am trying to do it exactly like Scott, but I cannot work out how to index the items separately. Can anybody help me, please? Many thanks, Miro
sdeck wrote:
> So, here is what I do for RSS Feeds.
>
> I parse the rss, and for each outlink, I create the outlink

readlinkdb fails to dump linkdb

2008-12-03 Thread brainstorm
Using Nutch 0.9 (Hadoop 0.17.1):
[EMAIL PROTECTED] working]$ bin/nutch readlinkdb /home/hadoop/crawl-20081201/crawldb -dump crawled_urls.txt
LinkDb dump: starting
LinkDb db: /home/hadoop/crawl-urls-20081201/crawldb
java.io.IOException: Type mismatch in value from map: expected org.apache.nutch.cra

Re: readlinkdb fails to dump linkdb

2008-12-03 Thread Doğacan Güney
On Wed, Dec 3, 2008 at 8:55 PM, brainstorm <[EMAIL PROTECTED]> wrote:
> Using nutch 0.9 (hadoop 0.17.1):
>
> [EMAIL PROTECTED] working]$ bin/nutch readlinkdb /home/hadoop/crawl-20081201/crawldb -dump crawled_urls.txt
> LinkDb dump: starting
> LinkDb db: /home/hadoop/crawl-urls-20081201/crawldb

Build failed in Hudson: Nutch-trunk #650

2008-12-03 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/650/changes
[...truncated 2152 lines...]
A src/plugin/protocol-http/src/test/org/apache/nutch
A src/plugin/protocol-http/src/test/org/apache/nutch/protocol
A src/plugin/prot