Yes, the Mahout analyzer would have to be updated for Lucene 4.0. I
suggest using an earlier Lucene release. Mahout uses Lucene in a very
simple way, and it is OK to use any earlier Lucene from 3.1 to 3.6.
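
If it is unclear which Lucene jar actually wins on the classpath, a small
check along these lines may help. This is only a sketch: WhichJar and
locate are names made up for this example, not part of Mahout or Lucene,
and it uses only standard JDK calls to report where a class was loaded from.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports which jar or directory a class is loaded from. */
final class WhichJar {

    /** Returns the load location of a class, or a short note if it cannot be found. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // JDK core classes have no code source.
                return "(bootstrap or unknown location)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            // VerifyError is a LinkageError, so a version clash shows up here.
            return "(found, but failed to load: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults are the two classes named in the error message.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.mahout.vectorizer.DefaultAnalyzer"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same classpath that bin/mahout ends up using. If Analyzer
resolves to lucene-core-4.0.0-ALPHA.jar, or DefaultAnalyzer fails to load,
that would point to the version mismatch.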

On Wed, Jul 18, 2012 at 11:50 PM, Videnova, Svetlana
<svetlana.viden...@logica.com> wrote:
> Hi Sean,
>
> In fact I was using Lucene version 3.6.0 (I saw that in the pom.xml),
> but in my classpath I was using Lucene version 4.0.0.
>
> I changed pom.xml to 4.0.0 => <lucene.version>4.0.0</lucene.version>
>
> But I still get the same error:
> ###
> Exception in thread "main" java.lang.VerifyError: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
> ###
>
> Should I change something else? Or maybe Lucene 4.0 is too recent for
> Mahout?
>
>
>
> Thank you
>
> -----Original Message-----
> From: Sean Owen [mailto:sro...@gmail.com]
> Sent: Wednesday, July 18, 2012 22:52
> To: user@mahout.apache.org
> Subject: Re: .txt to vector
>
> This means you're using it with an incompatible version of Lucene. I think 
> we're on 3.1. Check the version that Mahout depends upon and use at least 
> that version or later.
>
> On Wed, Jul 18, 2012 at 6:04 PM, Videnova, Svetlana
> <svetlana.viden...@logica.com> wrote:
>
>> I'm working with Mahout. I'm trying to write a Java web service
>> that takes the output of Solr and passes it to Mahout.
>> So far the recommendation part works.
>> Now I'm trying to cluster the data. For this I have to vectorise the
>> output of Solr.
>> Do you have any idea how to do this? I was following
>> https://cwiki.apache.org/MAHOUT/creating-vectors-from-text.html
>> but it doesn't work very well (or at all...).
>>
>> I'm trying to find out how to transform .txt files into vectors for
>> Mahout in order to cluster and categorise my information. Is it
>> possible? I saw that I have to use seqdirectory and seq2sparse.
>>
>> seqdirectory creates a file (with some numbers and everything...), so
>> this step is OK. But then when I use seq2sparse, it gives me this
>> error:
>>
>> csi@csi-SCENIC-W:/usr/local/apache-mahout-d6d6ee8$ ./bin/mahout seq2sparse --input ./examples/output/ --output ./toto/output/
>> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/usr/local/apache-mahout-d6d6ee8/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
>> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
>> 12/07/18 15:53:33 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
>> Exception in thread "main" java.lang.VerifyError: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
>>         at java.lang.ClassLoader.defineClass1(Native Method)
>>         at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>>         at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>>         at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>         at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:199)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>         at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:55)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>
>> I'm using only Lucene 4.0!
>>
>> CLASSPATH=/opt/lucene-4.0.0-ALPHA/demo/lucene-demo-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/core/lucene-core-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/analysis/common/lucene-analyzers-common-4.0.0-ALPHA.jar:/opt/lucene-4.0.0-ALPHA/queryparser/lucene-queryparser-4.0.0-ALPHA.jar:.
>>
>> Please, where am I going wrong?
>>
>>
>> Thank you all
>> Regards
>>
>>
>>
>>
>>
>>
>> Think green - keep it on the screen.
>>
>> This e-mail and any attachment is for authorised use by the intended
>> recipient(s) only. It may contain proprietary material, confidential
>> information and/or be subject to legal privilege. It should not be
>> copied, disclosed to, retained or used by, any other party. If you are
>> not an intended recipient then please promptly delete this e-mail and
>> any attachment and all copies and inform the sender. Thank you.
>>
>>
>



-- 
Lance Norskog
goks...@gmail.com
