Carrot2 clustering help

2009-08-10 Thread kazam
Hi there, has anyone configured Carrot2 with Nutch 0.8.1? I have the following configuration in nutch-default.xml for Carrot2, but every time I try to acquire the OnlineClusterer it comes back as null. clusterer = new OnlineClustererFactory(nutchConf).getOnlineClusterer();

Re: Registered plugin never invoked and urls skipped

2009-05-11 Thread kazam
32:23,052 >> > INFO fetcher.Fetcher - fetching >> Thanks, Kenan. >> On Thu, May 7, 2009 at 11:12 PM, Alexander Aristov <alexander.aris...@gmail.com> wrote: >> > Did you assign mime type to this plugin. What is it?

Registered plugin never invoked and urls skipped

2009-05-07 Thread kazam
Hi there, I am using nutch-0.8.1 with 5 custom plugins. According to the logs, all of them are being used except one, which is never invoked. The URLs that plugin was written for are also skipped altogether. Here are some excerpts from the hadoop.log file: 2009-05-07 14:27:41,227 IN

Re: Nutch fetch creates too many http sessions

2009-04-28 Thread kazam
e same domain in > Nutch seems like it might be interesting functionality, I don't believe > that currently exists. My suggestion is to look into tuning WebSphere > session timeouts. My guess would be they are set to a very high level. > > Dennis > > kazam wrote:

Nutch fetch creates too many http sessions

2009-04-27 Thread kazam
Hi there, I am generating Nutch indexes for our site, which runs on a WebSphere server. The indexing takes about 20 hours to complete. However, after about 15-16 hours the WebSphere server crashes because too many sessions have been created. It seems that each fetch creates a new session.

Re: common-terms.utf8 location

2009-03-06 Thread kazam
Any ideas? kazam wrote: > Hi there, > Nutch is giving me an error saying that > org.apache.hadoop.conf.Configuration common-terms.utf8 not found > I have tried to specify paths in Java using the configuration object. > ServletContext applicati

common-terms.utf8 not being found

2009-03-03 Thread kazam
Hi there, For some reason Nutch can't seem to find my common-terms.utf8 file. I have placed it under WEB-INF, WEB-INF/classes and even under WEB-INF/lib. In my nutch-default.xml the path to the file is as follows: analysis.common.terms.file common-terms.utf8 The name of a file containing
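
A bare file name such as common-terms.utf8 is typically looked up as a classpath resource rather than as a filesystem path, so one way to narrow the problem down is to ask the class loader directly whether it can see the file. The sketch below assumes that the lookup goes through the thread's context class loader; the class name ClasspathResourceCheck and the method locate are hypothetical helpers and not part of Nutch or Hadoop. In a servlet container, WEB-INF/classes and the jars inside WEB-INF/lib are on the web application's classpath, while a loose file placed directly under WEB-INF or WEB-INF/lib normally is not.

```java
import java.net.URL;

// Hypothetical diagnostic helper; not part of Nutch or Hadoop.
public class ClasspathResourceCheck {

    /**
     * Returns the URL that the class loader resolves for the given
     * resource name, or null if the resource is not visible to it.
     */
    static URL locate(String resourceName) {
        // Assumption: the lookup uses the context class loader,
        // falling back to the system class loader if none is set.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // Default to the file name from the question above.
        String name = args.length > 0 ? args[0] : "common-terms.utf8";
        URL url = locate(name);
        if (url == null) {
            System.out.println(name + " is not visible on the classpath");
        } else {
            System.out.println(name + " resolved to " + url);
        }
    }
}
```

Calling locate("common-terms.utf8") from inside the web application (for example from a servlet or JSP) shows which copy of the file, if any, the application's class loader actually resolves.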