Well, my Mahout-0.8-SNAPSHOT now works fine with the analyzer option
"org.apache.lucene.analysis.core.WhitespaceAnalyzer", but there are still
some hurdles to get over...
Could this be a Hadoop version incompatibility issue? If so, what is the
correct/minimum Hadoop version? (At least "ClusterDump" with
Mahout-0.8-SNAPSHOT previously worked fine against an existing K-means
result produced with 0.7.)
I have been running Hadoop-0.20.203 (pseudo-distributed) with Mahout-0.7 for
some time and only recently upgraded the Mahout side to 0.8-SNAPSHOT.

$MAHOUT_HOME/bin/mahout seq2sparse --namedVector -i NHTSA-seqfile01/ -o
NHTSA-namedVector -ow -a org.apache.lucene.analysis.core.WhitespaceAnalyzer
-chunk 200 -wt tfidf -s 5 -md 3 -x 90 -ng 2 -ml 50 -seq -n 2
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB:
/usr/local/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum
n-gram size is: 2
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum
LLR value: 50.0
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of
reduce tasks: 1
13/05/12 01:45:48 WARN hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

 at org.apache.hadoop.ipc.Client.call(Client.java:1030)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
 at $Proxy1.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.addBlock(Unknown Source)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)

13/05/12 01:45:48 WARN hdfs.DFSClient: Error Recovery for block null bad
datanode[0] nodes == null
13/05/12 01:45:48 WARN hdfs.DFSClient: Could not get block locations.
Source file
"/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar"
- Aborting...
13/05/12 01:45:48 INFO mapred.JobClient: Cleaning up the staging area
hdfs://localhost:9000/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001
Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

 at org.apache.hadoop.ipc.Client.call(Client.java:1030)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
 at $Proxy1.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.addBlock(Unknown Source)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
13/05/12 01:45:48 ERROR hdfs.DFSClient: Exception closing file
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar :
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
 at org.apache.hadoop.ipc.Client.call(Client.java:1030)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
 at $Proxy1.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.addBlock(Unknown Source)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
 at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)

Sorry for the long error log.
I believe my Hadoop-0.20.203 is up and running correctly:

$JAVA_HOME/bin/jps
13322 TaskTracker
12985 DataNode
12890 NameNode
13937 Jps
13080 SecondaryNameNode
13219 JobTracker
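
For completeness, here is the kind of sanity check that can be run next. The
"could only be replicated to 0 nodes" error usually means the namenode sees no
live datanodes (or none with free space), which jps alone cannot confirm. These
are standard Hadoop 0.20.x CLI commands, sketched as a suggestion rather than
output from my setup:

```shell
# "could only be replicated to 0 nodes" usually means the namenode sees
# no live datanodes, or none with remaining capacity. These checks use
# the standard Hadoop 0.20.x CLI; guarded so the script is a no-op on
# machines without hadoop on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop dfsadmin -report        # live datanodes and remaining capacity
  hadoop dfsadmin -safemode get  # is HDFS stuck in safe mode?
  status="checked"
else
  status="hadoop not on PATH; skipping"
fi
echo "$status"
```

If `-report` shows 0 live datanodes even though jps shows a DataNode process,
the datanode log is the next place to look (namespace ID mismatches after a
namenode re-format are a common cause).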
I hope someone can help sort this out.
Regards,
Y.Mandai


2013/5/9 Yutaka Mandai <20525entrad...@gmail.com>

> Suneel
> Great to know.
> Thanks!
> Y.Mandai
>
> Sent from my iPhone
>
> On 2013/05/07, at 22:24, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
> > It should be
> > org.apache.lucene.analysis.core.WhitespaceAnalyzer (you were missing the
> > 'core').
> >
> > Mahout trunk is presently at Lucene 4.2.1, and Lucene has gone through a
> > major refactoring in 4.x.
> > Check the Lucene 4.2.1 docs for the correct package name.
> >
> >
> >
> >
> > ________________________________
> > From: 万代豊 <20525entrad...@gmail.com>
> > To: "user@mahout.apache.org" <user@mahout.apache.org>
> > Sent: Tuesday, May 7, 2013 3:20 AM
> > Subject: Class Not Found from 0.8-SNAPSHOT for
> > org.apache.lucene.analysis.WhitespaceAnalyzer
> >
> >
> > Hi all
> > I guess I have seen very similar topics somewhere about class-name
> > changes in Mahout-0.8-SNAPSHOT for some of the Lucene analyzers, and here
> > is another one that I need solved.
> > Mahout gave me an error for seq2sparse with the Lucene analyzer option as
> > follows,
> > which of course had been working in at least Mahout 0.7.
> >
> > $MAHOUT_HOME/bin/mahout seq2sparse --namedVector -i NHTSA-seqfile01/ -o
> > NHTSA-namedVector -ow -a org.apache.lucene.analysis.WhitespaceAnalyzer
> > -chunk 200 -wt tfidf -s 5 -md 3 -x 90 -ng 2 -ml 50 -seq -n 2
> > Running on hadoop, using /usr/local/hadoop/bin/hadoop and
> > HADOOP_CONF_DIR=
> > MAHOUT-JOB:
> > /usr/local/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> > 13/05/07 15:41:12 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum
> > n-gram size is: 2
> > 13/05/07 15:41:18 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum
> > LLR value: 50.0
> > 13/05/07 15:41:18 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of
> > reduce tasks: 1
> > Exception in thread "main" java.lang.ClassNotFoundException:
> > org.apache.lucene.analysis.WhitespaceAnalyzer
> > I confirmed which classpath Mahout is referring to with:
> > $ $MAHOUT_HOME/bin/mahout classpath
> > and obtained the Lucene-related classpath entries below.
> >
> > /usr/local/trunk/examples/target/dependency/lucene-analyzers-common-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-benchmark-4.2.1.jar:
> > /usr/local/trunk/examples/target/dependency/lucene-core-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-facet-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-highlighter-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-memory-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-queries-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-queryparser-4.2.1.jar
> > /usr/local/trunk/examples/target/dependency/lucene-sandbox-4.2.1.jar
> >
> > I want to believe this is a simple class-name-change issue.
> > Please advise.
> > Regards,
> > Y.Mandai
>
