On Thu, Aug 2, 2012 at 11:57 AM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
Hello,
I'm writing a Java app to cluster my data with k-means.
These are the steps:
1)
LuceneDemo: creates the index and vectors using the lucene.vector library. Input:
path of my .txt, output: index (segments_1
AM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
Hi,
My goal is to transform the vectors created by lucene.vector (and clustered
with k-means) into a human-readable format. For that I am using the
ClusterDumper function in Eclipse. But that code does not generate
any files. What
-08-2012 12:50, Videnova, Svetlana wrote:
I already generated the points directory when I ran the clustering (k-means in my case).
But for the moment I can't generate the clusterdump because of an error on this line:
ClusterDumper.readPoints(new Path("output/kmeans/clusters-0"), 2, conf);
Second parameter is double
Hello,
I'm writing a Java app to cluster my data with k-means.
These are the steps:
1)
LuceneDemo: creates the index and vectors using the lucene.vector library. Input: path of my
.txt, output: index (segments_1, segments.gen, .fdt, .fdx, .fnm, .frq, .nrm,
.prx, .tii, .tis, .tvd, .tvx and the most
Hello,
I'm using Mahout 0.7 and trying to cluster, but apparently the
KMeansClusterer class is no longer available in 0.7.
Can somebody please tell me which class replaces KMeansClusterer?
Thank you
and
generic way.
KMeansDriver's run method is all you need to use KMeans Clustering.
On 02-08-2012 15:25, Videnova, Svetlana wrote:
Hello,
I'm using Mahout 0.7 and trying to cluster, but apparently the
KMeansClusterer class is no longer available in 0.7.
Can somebody please tell me which
Hi mahouters,
I am trying to use the Mahout library in my Java app.
But when I try to cluster by calling this:
DocumentProcessor.tokenizeDocuments(new Path(inputDir),
    analyzer.getClass().asSubclass(Analyzer.class), tokenizedPath, conf);
And this:
InputDriver.runJob(new Path(inputDir),
cluster?
Please copy your code and try to run it on a Hadoop cluster.
-- Original Message --
From: Videnova, Svetlana;
Sent: Tuesday, July 31, 2012, 3:27 PM
To: user@mahout.apache.org;
Subject: mahout lib : permissions
Hi mahouters,
I am trying to use the mahout lib with my app
I'm using Cygwin. The permission problems were apparently because I wasn't using
Cygwin. Thanks, all.
But I still have this error. What about the job problems?
Exception in thread "main" java.lang.IllegalStateException: Job failed!
at
.
This should work. I am still not sure why it doesn't work with the direct
download version.
Thanks,
Kiran
On Tue, Jul 31, 2012 at 8:30 AM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
I'm using Cygwin. The permission problems were apparently because I wasn't
using Cygwin. Thanks, all
artichokes 14 0
cheese 17 1
deron 14 2
french 14 3
fries 14 4
hamburger 14 5
nicole 17 6
salad 17 7
steak 14 8
-Original Message-
From: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Wednesday, July 25
Hi everybody,
Is it possible, instead of creating a vector from a .txt file or a Lucene index, to create
a vector from a stream (one that looks like XML)?
Stream example:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">16</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str
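As a rough illustration of the first step only (reading such a stream), here is a minimal sketch that uses the JDK's DOM parser to pull the str elements out of a response like the fragment above. The class name SolrParamsReader and the method readStrFields are hypothetical, the closing tags in the sample are assumed because the example above is cut off, and turning the extracted text into Mahout vectors is not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Mahout or Solr.
class SolrParamsReader {

    // Collects every <str name="..."> element into a name -> text map,
    // preserving the order in which the elements appear.
    static Map<String, String> readStrFields(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("str");
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            fields.put(e.getAttribute("name"), e.getTextContent());
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Sample response; the closing tags are assumed.
        String xml = "<response><lst name=\"responseHeader\">"
                + "<int name=\"status\">0</int><int name=\"QTime\">16</int>"
                + "<lst name=\"params\"><str name=\"indent\">on</str>"
                + "<str name=\"start\">0</str></lst></lst></response>";
        System.out.println(readStrFields(xml)); // prints {indent=on, start=0}
    }
}
```

For a large or continuous stream, a streaming parser (for example javax.xml.stream) would probably be a better fit than loading the whole document with DOM.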
. This percentage is expressed as a value between 0 and 1. The default is 0.
You want .3, not 30!
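To make the arithmetic explicit: an option documented as "a value between 0 and 1" expects 30 percent written as 0.3. The sketch below is a hypothetical helper (PercentArgument is not a Mahout class) that converts a percentage to such a fraction and rejects values outside the range before they reach the command line.

```java
// Hypothetical helper, not part of Mahout.
class PercentArgument {

    // Converts a percentage (0..100) to the fraction (0..1) the option expects.
    static double toFraction(double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]: " + percent);
        }
        return percent / 100.0;
    }

    // Checks that a value is already a valid fraction between 0 and 1.
    static double requireFraction(double value) {
        if (value < 0 || value > 1) {
            throw new IllegalArgumentException("expected a value between 0 and 1: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toFraction(30)); // prints 0.3
        try {
            requireFraction(30);            // 30 is a percentage, not a fraction
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```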
On Tue, Jul 24, 2012 at 1:27 AM, Videnova, Svetlana
term vectors.
http://code.google.com/p/luke/
It uses Swing, so you need the index on your local PC.
On Wed, Jul 25, 2012 at 12:15 AM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
Yes, I saw the help; that's why I was trying with something between 0 and 1, but
I have all the time
to vector
It is a jar file, so just java -jar luke.jar
But, there's a problem. Luke releases are keyed to different Lucene releases.
You need the right Luke download for your version of Lucene.
http://code.google.com/p/luke/downloads/list
On Wed, Jul 25, 2012 at 12:52 AM, Videnova, Svetlana
in the index with
the suffix .tvf. This has the data which the Mahout lucene job looks for.
On Mon, Jul 23, 2012 at 8:03 AM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
Hello again,
I got my indexed files from Solr on Windows and copied them into a
directory on Ubuntu
java.lang.IllegalArgumentException
-Original Message-
From: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Tuesday, July 24, 2012 09:16
To: user@mahout.apache.org
Subject: RE: .txt to vector
Hi Lance,
My directory now contains _0.tvf and the others.
With the command:
apache-mahout-d6d6ee8
/TermVectorComponent
I don't know if lucene.vector is in the Mahout 0.5 release.
For cluster outputs, the current cluster dumper supports 'graphml'
format. Gephi is an interactive graph browser. You can use it to look at small cluster
jobs.
On Thu, Jul 19, 2012 at 11:34 PM, Videnova, Svetlana
svetlana.viden
: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Monday, July 23, 2012 10:18
To: user@mahout.apache.org
Subject: RE: .txt to vector
I'm using Mahout on Ubuntu and Solr on Windows. I guess with a web service I can
get the indexed files from Solr and then, thanks to a Java program in the web
programs in Mahout, and otherwise covers
other text processing problems.
Mahout in Action is very good, and can help you use most of the Mahout features.
http://www.manning.com/owen
http://www.manning.com/ingersoll
On Thu, Jul 19, 2012 at 8:08 AM, Videnova, Svetlana
svetlana.viden...@logica.com
That's a very good question, I was expecting an answer too...
This was the answer given to me by Mahout users:
the type of input and output depends on the job you want to run.
I was clustering .txt files for the moment.
-Original Message-
From: shriram [mailto:ghai12...@gmail.com]
at 6:04 PM, Videnova, Svetlana
svetlana.viden...@logica.com wrote:
I'm working with Mahout. I'm trying to write a web service in Java
myself that will take the output of Solr and give this file to Mahout.
So far I have successfully done the recommendation part.
Now I'm trying to cluster
Subject: Re: .txt to vector
Yes, the Mahout analyzer would have to be updated for Lucene 4.0. I suggest
using an earlier one. Mahout uses Lucene in a very simple way, and it is
OK to use any earlier Lucene from 3.1 to 3.6.
On Wed, Jul 18, 2012 at 11:50 PM, Videnova, Svetlana
svetlana.viden
file:/usr/local/apache-mahout-d6d6ee8/examples/output/clusters-8/data does not
exist.
Best Regards
Alexander Aristov
On 19 July 2012 12:30, Videnova, Svetlana svetlana.viden...@logica.com wrote:
Hi Lance,
Thank you for your fast answer.
I was changing my :
CLASSPATH=/opt/lucene-3.6.0/lucene
-vectors tokenized-documents
What should the vector files look like?
And can somebody please explain to me what each directory of the output
above represents?
Thank you
-Original Message-
From: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Thursday, July 19, 2012 14
is the chunk-0 file exactly?
What does the clusters-dump created at the end by the clusterdump command
represent?
Thank you all!
-Original Message-
From: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Thursday, July 19, 2012 15:07
To: user@mahout.apache.org
Subject: RE: .txt
I'm working with Mahout. I'm trying to write a web service in Java myself that
will take the output of Solr and give this file to Mahout. So far I have
successfully done the recommendation part.
Now I'm trying to cluster. For this I have to vectorize the output of Solr.
Do you have any idea how
Memory: 67M/170M
:):):):):):):):):):)
Then thanks to : Sean Owen and his updates on
http://zoekja.nl/proxy/index.php?q=aHR0cHM6Ly9naXRodWIuY29tL2FwYWNoZS9tYWhvdXQ%3D
-Original Message-
From: Videnova, Svetlana [mailto:svetlana.viden...@logica.com]
Sent: Friday
Hello,
I'm trying to run the first example from ch02 of Mahout in Action.
I get the following errors.
Do I have to create the pom.xml?
If yes: what do I have to put in it? Where do I have to put it?
If no: where can I find it? Apparently Maven did not find it.
Where can I find taste files of