Hi David,
I tried to find a way to visualize the results, too. In the DisplayClustering
example, clustering runs on x,y vectors, which are easy to visualize. But in
the real world we have n-dimensional vectors, and I don't think they can be
plotted as points on an xy chart.
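One workaround that is sometimes used for this (it is not proposed anywhere in this thread) is to project the n-dimensional vectors down to two dimensions before plotting. The sketch below uses a plain random linear projection, which is the simplest option and loses information. The class name TwoDProjection and the method project are hypothetical and are not part of Mahout; the code assumes every row has the same length.

```java
import java.util.Random;

// Hypothetical helper, not part of Mahout: maps n-dimensional vectors to (x, y) points.
final class TwoDProjection {

    // Projects each row onto two random directions drawn once from a seeded generator.
    // Assumes all rows have the same number of dimensions.
    static double[][] project(double[][] vectors, long seed) {
        if (vectors.length == 0) {
            return new double[0][2];
        }
        int n = vectors[0].length;
        Random rnd = new Random(seed);
        double[] dirX = new double[n];
        double[] dirY = new double[n];
        for (int i = 0; i < n; i++) {
            dirX[i] = rnd.nextGaussian();
            dirY[i] = rnd.nextGaussian();
        }
        double[][] points = new double[vectors.length][2];
        for (int r = 0; r < vectors.length; r++) {
            for (int i = 0; i < n; i++) {
                points[r][0] += vectors[r][i] * dirX[i]; // x coordinate
                points[r][1] += vectors[r][i] * dirY[i]; // y coordinate
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Made-up 5-dimensional sample data, for illustration only.
        double[][] sample = {
            {1.0, 0.0, 2.0, 0.0, 0.5},
            {0.0, 3.0, 0.0, 1.0, 0.0},
            {1.0, 0.0, 2.0, 0.0, 0.5}
        };
        for (double[] p : project(sample, 42L)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

The resulting pairs could then be drawn the same way the x,y samples are drawn in the display examples. Points that are far apart in the original space can land close together after a projection like this, so the picture is only a rough guide.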
On Thu, Dec 12, 2013 at 6:42 PM, David G wrote:
Hi,
Is it possible to get Dirichlet clustering results with named vectors, as in
k-means?
With named vectors in k-means, I can see which files are located in each cluster.
, and normPower option set to -1.0f. This applies to
> HighDFWordsPruner.pruneVectors, too.
>
> I believe that solves your problem.
>
> Best
>
> Gokhan
>
>
> On Wed, Sep 4, 2013 at 4:54 PM, Taner Diler wrote:
>
> > Actually, my real motivation was to visualize r
With 0.8 you can set the conf files as follows:
Configuration conf = new Configuration();
String HADOOP_HOME = System.getenv("HADOOP_PREFIX");
conf.addResource(new Path(HADOOP_HOME, "conf/core-site.xml"));
conf.addResource(new Path(HADOOP_HOME, "conf/hdfs-site.xml"));
conf.addR
un seq2sparse. I'm gonna debug it anyway.
>
> And I would like to know how you run the java code. Does your main class
> extend AbstractJob to make it "runnable" using bin/mahout? And does it have
> a main method that submits your job to your hadoop cluster? Are you us
Value: 2
Key: 3: Value: 2
Key: 4: Value: 9
Key: 5: Value: 4
dictionary.file-0
Key class: class org.apache.hadoop.io.Text Value Class: class
org.apache.hadoop.io.IntWritable
Key: 0: Value: 0
Key: 0.003: Value: 1
Key: 0.006913: Value: 2
Key: 0.007050: Value: 3
Key: 0.01: Value: 4
Key: 0.02: Value: 5
Key: 0.025
mahout seq2sparse -i reuters-seqfiles/ -o reuters-kmeans-try -chunk 200 -wt
tfidf -s 2 -md 5 -x 95 -ng 2 -ml 50 -n 2 -seq
This command works well.
Gokhan, I changed the minLLR value to 1.0 in Java, but the result is the same:
empty tfidf-vectors.
On Tue, Sep 3, 2013 at 10:47 AM, Taner Diler wrote
if that works
> well?
>
> On Sun, Sep 1, 2013 at 7:23 PM, Suneel Marthi wrote:
>
> > I would first check to see if the input 'seqfiles' for TFIDFGenerator
> > have any meat in them.
> > This could also happen if the input seqfiles are empty.
>
>
>
Hi all,
How can I visualize Reuters KMeans Clustering as in DisplayKMeans.java?
Thanks.
Hi all,
I am trying to run the Reuters KMeans example in Java, but TFIDFConverter
generates empty tfidf-vectors. How can I fix that?
private static int minSupport = 2;
private static int maxNGramSize = 2;
private static float minLLRValue = 50;
private static float normPower = 2;
priv
Hi all,
I am trying to cluster texts with Dirichlet. I have a few questions about the
result:
1. How can I display the data and clusters in a chart, as in the
DisplayDirichlet example? In DisplayDirichlet, the sample data has x,y values,
so it can be displayed. But in the TF-IDF result, one file has many word frequency vec
After converting the Reuters sgm files to txt format (in reuters-extracted),
on the first mahout command, seqdirectory, you should give the input path as
file:///your_dir/reuters-extracted. When I gave the input parameter as
/your_dir/reuters-extracted, I got the same problem with k-means clustering.
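A likely reason the file:/// prefix matters: as far as I know, Hadoop resolves a path that carries no scheme against the configured default filesystem, which on a cluster is usually HDFS, not the local disk. The sketch below only shows the difference between the two strings, using java.net.URI from the standard library. The class name InputPathScheme is hypothetical, and nothing here calls Hadoop or Mahout.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop or Mahout: reports the scheme of a path string.
final class InputPathScheme {

    // Returns the URI scheme ("file", "hdfs", ...) or "(none)" when the string has no scheme.
    static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "(none)" : scheme;
    }

    public static void main(String[] args) {
        // Explicit local-filesystem scheme, as recommended above.
        System.out.println(schemeOf("file:///your_dir/reuters-extracted")); // prints "file"
        // No scheme: which filesystem this means is left to the configuration.
        System.out.println(schemeOf("/your_dir/reuters-extracted"));        // prints "(none)"
    }
}
```

If the second form is used and the default filesystem is HDFS, the job would look for /your_dir/reuters-extracted there, which is consistent with the empty input described above.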
On Mon, Jul 2
Hi all,
I want to be sure about something.
I have lots of articles about sports, mobile technologies, beverage & food,
automotive...
When I get a new article, the system should tell me that it is about
beverage & food.
Classification does this, am I right? Is there a sample or tutorial
about l
I'm getting "*No input clusters found in
reuters-kmeans-clusters/part-randomSeed. Check your -c argument*" while
running the k-means example from the "Mahout in Action" samples. I searched on
Google, but I didn't find a solution. I'm using Mahout 0.7. How can I run
k-means clustering?
command:
taner
14 matches