[Nutch-general] Why datanode does not work properly on slave?

2007-06-07 Thread Ilya Vishnevsky
Hello! I'm deploying Nutch on two computers. When I run the start-all.sh script everything goes well, but the datanode on the slave computer does not log anything. All other parts of Hadoop (namenode, jobtracker, both tasktrackers, and the datanode on the master) log their information properly. Also, when I put some files

Re: [Nutch-general] Why datanode does not work properly on slave?

2007-06-07 Thread Ilya Vishnevsky
I've just changed the dfs.replication property in hadoop-site.xml from 1 to 2. Now I get ArithmeticException: / by zero when I try to put something into DFS. I should also mention that both nodes run Windows.

[Nutch-general] no datanode to stop

2007-06-05 Thread Ilya Vishnevsky
Hello! I'm trying to deploy Nutch on two machines (the slave's IP is, for example, 66.66.66.66). After setting everything up I type bin/start-all.sh and get the following output: starting namenode, logging to /cygdrive/c/nutch/testdfs/logs/hadoop-nutch-namenode-Thanatos.out localhost: starting datanode,

[Nutch-general] NoClassDefFoundError while trying to run (format) namenode

2007-06-01 Thread Ilya Vishnevsky
Hello. I'm trying to deploy Nutch and Hadoop as shown here: http://wiki.apache.org/nutch/NutchHadoopTutorial The difference is that I use Win XP. After typing bin/hadoop namenode -format in Cygwin, I get the following exceptions: Exception in thread "main" java.lang.NoClassDefFoundError:

[Nutch-general] Nutch on Windows. ssh: command not found

2007-05-30 Thread Ilya Vishnevsky
Hello. I'm trying to run the shell scripts that start Nutch. I use Windows XP, so I installed Cygwin. When I execute bin/start-all.sh, I get the following messages: localhost: /cygdrive/c/nutch/nutch-0.9/bin/slaves.sh: line 45: ssh: command not found localhost: /cygdrive/c/nutch/nutch-0.9/bin/slaves.sh: line

[Nutch-general] SegmentReader - (1 to retrieve), infinite loop.

2007-05-18 Thread Ilya Vishnevsky
While using SegmentReader I ran into the following problem. When trying to read the content for a URL from a segment, I got the following log: INFO 2007-05-18 03:26:58,562 [main] SegmentReader - SegmentReader: get 'http://www.casdn.neu.edu/~etam/military/FM19-30/9309ch.pdf' DEBUG 2007-05-18 03:27:03,593 [main]

[Nutch-general] SequenceFile.Reader. Access denied

2007-05-15 Thread Ilya Vishnevsky
Hi. I'm trying to read the crawl_generate sequence file from my Nutch: SequenceFile.Reader reader = new SequenceFile.Reader(fs, genPath, conf); Here is the exception I get when I try to run my code: java.io.FileNotFoundException: C:\webdb\aco\segments\20070515133433\crawl_generate (Access is

[Nutch-general] Adding documents to already created distributed index

2007-04-26 Thread Ilya Vishnevsky
Hi all! As I understand it, Nutch creates a distributed index in Hadoop, called Indexes, while indexing fetched segments. It then merges these Indexes into one Index on the local file system. We use parts of Nutch in our project. We want to use only the distributed index (Indexes). The problem is that we want to

[Nutch-general] How to reIndex after reCrawl?

2007-04-26 Thread Ilya Vishnevsky
Another question on a similar subject: suppose I did a recrawl and found some new pages. I want to add them to my Index. I use Indexer to create Indexes. How can I add these Indexes to my existing Index? If I just merge the Indexes, I'll lose the documents stored in my old Index.

[Nutch-general] Lucene IndexWriter and Nutch index

2007-03-22 Thread Ilya Vishnevsky
Hello! I'm trying to write a document to an existing index. I've created the following method: public void testIndexWriter() { Analyzer analyzer = new StandardAnalyzer(); Document testDoc = new Document(); testDoc.add(new Field("apple", "apple", Field.Store.YES,

Re: [Nutch-general] Lucene IndexWriter and Nutch index

2007-03-22 Thread Ilya Vishnevsky
Sorry, my method uses FsDirectory, not DCFsDirectory. -Original Message- From: Ilya Vishnevsky [mailto:[EMAIL PROTECTED] Sent: Thursday, March 22, 2007 4:42 PM To: nutch-user@lucene.apache.org Subject: Lucene IndexWriter and Nutch index Hello! I'm trying to write a document