You must stop using Mahout 0.5 and switch to Mahout 0.8 or 0.9, for the 
following reasons:

a)  Mahout 0.5 is past its shelf life and has been purged from all Apache 
mirrors, so it is no longer available for download.
b)  Mahout 0.5 used Lucene 3.x, whereas Mahout 0.8 and above use Lucene 4.x. 
Lucene 4.x is not backward compatible with Lucene 3.x; most Lucene packages 
and classes have been refactored, and the indexes are faster and leaner.


The issue you are seeing is due to the Lucene 3.x jars missing from your 
classpath. Add lucene-core-3.5.jar to your classpath and you should be good.
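
If you want to confirm what the JVM actually sees before rerunning the job, a 
small reflection check along these lines may help. This is only a sketch and 
not part of Mahout or Lucene: the class AnalyzerCheck and its diagnose method 
are hypothetical names. It mirrors the Class.newInstance call in the stack 
trace, which can only succeed when the class is loadable, is not abstract, and 
has a public no-argument constructor.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

// Hypothetical diagnostic helper; not part of Mahout or Lucene.
final class AnalyzerCheck {

    // Reports whether the named class could be created via Class.newInstance():
    // it must be loadable, concrete, and have a public no-argument constructor.
    static String diagnose(String className) {
        Class<?> cls;
        try {
            // 'false' = load without running static initializers.
            cls = Class.forName(className, false, AnalyzerCheck.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return "NOT_LOADABLE";   // the class itself is not on the classpath
        } catch (LinkageError e) {
            return "NOT_LOADABLE";   // e.g. a class it depends on is missing
        }
        if (Modifier.isAbstract(cls.getModifiers())) {
            return "ABSTRACT";
        }
        for (Constructor<?> c : cls.getConstructors()) {
            if (c.getParameterTypes().length == 0) {
                return "OK";
            }
        }
        return "NO_PUBLIC_NO_ARG_CONSTRUCTOR";
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.lucene.analysis.en.EnglishAnalyzer";
        System.out.println(name + ": " + diagnose(name));
    }
}
```

Compile it and run it with the same jars on the classpath that you give to 
Mahout. NOT_LOADABLE points at a missing jar; NO_PUBLIC_NO_ARG_CONSTRUCTOR 
means the class was found but cannot be created through newInstance.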

On Wednesday, February 5, 2014 9:05 AM, Sznajder ForMailingList 
<bs4mailingl...@gmail.com> wrote:
 
Hi,
I am using Mahout 0.5 and would like to use the EnglishAnalyzer when
running k-means.

However, when running the following command, I get an exception:

bin/mahout seq2sparse -i logs-seqFiles/ -o log-vectors-monogram-englishanalyzer -ow -s 1 -a org.apache.lucene.analysis.en.EnglishAnalyzer


I get

Exception in thread "main" java.lang.InstantiationException: org.apache.lucene.analysis.en.EnglishAnalyzer
        at java.lang.J9VMInternals.newInstanceImpl(Native Method)
        at java.lang.Class.newInstance(Class.java:1375)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:198)

How can I add this Analyzer to the path?

Benjamin
