DistributedLanczosSolver has been deprecated (and the blog post you mention is
old). Use Stochastic SVD (SSVD) instead.
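For reference, a minimal sketch of invoking Mahout's SSVD driver from the command line. The paths and the rank/oversampling/power-iteration values below are placeholders, not taken from this thread:

```shell
# Hypothetical SSVD invocation. Input is a distributed row matrix
# (SequenceFile<IntWritable,VectorWritable>); -k is the requested rank,
# -p the oversampling parameter, -q the number of power iterations.
bin/mahout ssvd \
  -i /path/to/matrix \
  -o /path/to/ssvd-output \
  -k 50 -p 15 -q 1
```

Higher -q improves accuracy at the cost of extra MapReduce passes; -k plus -p must stay well below the smaller matrix dimension.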
On Friday, December 20, 2013 12:41 AM, Partha Pratim Talukdar
partha.taluk...@cs.cmu.edu wrote:
Hello,
I am running mahout (v0.8) svd over a sparse matrix of size
Really interesting, I would like to have that in Paris :)
On 20 December 2013 07:47, Michael Wechner michael.wech...@wyona.com wrote:
Hi
Are you also considering to tell this (or maybe a shorter version) at
ApacheCon?
Thanks
Michael
Am 20.12.13 03:50, schrieb Koji Sekiguchi:
I'm
Uh, an interesting idea that I've never thought of.
Sorry but I don't have a plan to go to ApacheCon.
koji
Suneel and others,
I am still getting the strange results when I do the tour. Suneel: I
manually wiped out the temp folder and also deleted the reuters-XXX
folders.
Also, per your advice I added the -ow option to all of the commands.
NOTE: The step to create a matrix would NOT take a -ow option
Sorry Scott I should have looked at this more closely. I apologize.
1. You are doing a seqdumper of the matrix (which is generated from the rowid
job and is not the output of the rowsimilarity job).
The rowid job generates an M x N matrix, where M = the number of documents and
N = the number of terms associated with
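As a sketch of the step being described (paths hypothetical), the rowid job converts the keyed vectors into the integer-row matrix that downstream jobs consume:

```shell
# Hypothetical paths. mahout rowid writes two outputs under -o:
#   matrix   - SequenceFile<IntWritable,VectorWritable>, the M x N matrix
#   docIndex - SequenceFile<IntWritable,Text> mapping row ids back to doc keys
bin/mahout rowid \
  -i /path/to/tfidf-vectors \
  -o /path/to/rowid-output
```

The docIndex file is what lets you translate row numbers in later output (e.g. from rowsimilarity) back to document names.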
Hi All,
I was able to do the clustering and need some help with viewing the result. I
get the following problem.
./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -d
/scratch/dummyvectorfinalclusters
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning:
Hi All,
I was able to resolve this issue by adding the following to my code:
DistributedCache.addFileToClassPath(
    new Path("/scratch/mahout-math-0.9-SNAPSHOT.jar"), conf, fs);
DistributedCache.addFileToClassPath(
    new Path("/scratch/mahout-core-0.9-SNAPSHOT.jar"), conf, fs);
Are you working off of trunk? 'clusterdump' is being used in
examples/bin/cluster-reuters.sh.
On Friday, December 20, 2013 5:33 PM, Sameer Tilak ssti...@live.com wrote:
Suneel:
Yes, I am working off of trunk. I saw that example. In my case the data is
numeric -- I assume that means no need for a dictionary, etc. I am not sure what
is going on, but I still get the following errors:
./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -o
Hi Ken,
Thanks. I was going down that route. I was wondering if there is any
advantage to the approach that uses Tool and calls ToolRunner.run() over the one
that uses DistributedCache.addFileToClassPath(). Maybe the former is more
generic and can help with things other than adding jar files.
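One concrete difference, as a hedged sketch: a driver run through ToolRunner.run() gets Hadoop's generic options parsed for free, so extra jars can be supplied on the command line instead of through DistributedCache calls in code. The driver class name and paths below are hypothetical:

```shell
# With a ToolRunner-based driver, GenericOptionsParser handles -libjars,
# shipping the listed jars to the task classpath automatically; no
# DistributedCache.addFileToClassPath() calls are needed in the driver.
hadoop jar myjob.jar com.example.MyDriver \
  -libjars /scratch/mahout-math-0.9-SNAPSHOT.jar,/scratch/mahout-core-0.9-SNAPSHOT.jar \
  /path/to/input /path/to/output
```

The same mechanism also covers -files and -archives and -D property overrides, which is the sense in which the ToolRunner route is more generic than adding jars programmatically.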
I would investigate all of those 'Unable to add .' messages first. Check out
the latest code and run a clean build.
On Friday, December 20, 2013 5:58 PM, Sameer Tilak ssti...@live.com wrote:
Hi All,
My HADOOP_CLASSPATH was interfering somehow. Things seem to work fine now.
-bash-4.1$ export HADOOP_CLASSPATH=
./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final --pointsDir
/scratch/clusterdump
MAHOUT-JOB:
Suneel,
Thank you for your help. :) Thought I was completely in the ditch.
If you are interested: inline with your comments are demonstrations that I
finally have it (and the commands that I used)…
YAQ (Yet another question):
How do I see with the dumper the documents that belong in a given
What does the data in cdump.txt represent? Can you point me in the right
direction?
SCott
On 12/20/13 4:30 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:
You could use clusterdump to see the output of your clusters.
Eg:
$MAHOUT clusterdump \
-i ${WORK_DIR}/reuters-kmeans/clusters-*-final \
-o ${WORK_DIR}/reuters-kmeans/clusterdump \
-d ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
-dt sequencefile -b 100 -n
Which cdump.txt ?
On Friday, December 20, 2013 7:29 PM, Suneel Marthi suneel_mar...@yahoo.com
wrote:
Suneel,
I think I have it :)
Pls confirm this understanding:
I'm looking at the cdump.out that comes from clusterdump. It has the 20
clusters, the top words in each cluster, and the vectors that are members
of each cluster. Do I have it? Am I getting this?
Thanks,
SCott
You got it.
On Friday, December 20, 2013 7:36 PM, Scott C. Cote scottcc...@gmail.com
wrote: