[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

Jeff Eastman (JIRA) Wed, 17 Aug 2011 14:07:50 -0700

    [ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086595#comment-13086595 ]


Jeff Eastman commented on MAHOUT-524:
-------------------------------------

The original example was extracting 5 eigenvectors and thus returned 5-d 
results. I changed it to extract 2 vectors and it used to run but displayed 
incorrect results.

I'm still getting a FileNotFoundException (since pre-0.5 testing, IIRC) in the 
bowels of DRM.times while running this in local Hadoop mode. I wonder if it is 
possible to add a --method sequential implementation for SpectralKMeans to help 
separate the algorithmic issues from the file bookkeeping ones?

We have a sequential Lanczos implementation...
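
To illustrate the kind of separation I mean: the step that fails in the trace 
below appears to be the matrix-vector product the Lanczos solver requests from 
DRM.times. A rough in-memory sketch of that product is shown here. This is 
plain Java, not Mahout's API; the class and method names are hypothetical and 
the dense double[][] layout is an assumption made only for illustration.

```java
/** Hypothetical helper: an in-memory stand-in for a distributed times(). */
public final class SequentialTimes {
    private SequentialTimes() {}

    /** Returns A * v for a dense, row-major matrix held entirely in memory. */
    public static double[] times(double[][] a, double[] v) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            // Guard against ragged rows or a vector of the wrong length.
            if (a[i].length != v.length) {
                throw new IllegalArgumentException(
                    "row " + i + " has " + a[i].length
                        + " columns, expected " + v.length);
            }
            double sum = 0.0;
            for (int j = 0; j < v.length; j++) {
                sum += a[i][j] * v[j];
            }
            result[i] = sum;
        }
        return result;
    }
}
```

Nothing here touches the file system, so any wrong result from a path like 
this would point at the algorithm rather than at temporary-file bookkeeping.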

Exception in thread "main" java.lang.IllegalStateException: java.io.FileNotFoundException: File file:/home/dev/workspace/mahout/examples/output/calculations/laplacian-33/tmp/data does not exist.
        at org.apache.mahout.math.hadoop.DistributedRowMatrix.times(DistributedRowMatrix.java:222)
        at org.apache.mahout.math.decomposer.lanczos.LanczosSolver.solve(LanczosSolver.java:104)
        at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.runJob(DistributedLanczosSolver.java:72)
        at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:155)
        at org.apache.mahout.clustering.display.DisplaySpectralKMeans.main(DisplaySpectralKMeans.java:72)
Caused by: java.io.FileNotFoundException: File file:/home/dev/workspace/mahout/examples/output/calculations/laplacian-33/tmp/data does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
        at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:51)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:211)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:929)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:921)
        at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:838)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:791)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:765)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1200)
        at org.apache.mahout.math.hadoop.DistributedRowMatrix.times(DistributedRowMatrix.java:214)
        ... 4 more


> DisplaySpectralKMeans example fails
> -----------------------------------
>
>                 Key: MAHOUT-524
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-524
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.4, 0.5
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>              Labels: clustering, k-means, visualization
>             Fix For: 0.6
>
>         Attachments: aff.txt, raw.txt, spectralkmeans.png
>
>
> I've committed a new display example that attempts to push the standard 
> mixture of models data set through spectral k-means. After some tweaking of 
> configuration arguments and a bug fix in EigenCleanupJob it runs spectral 
> k-means to completion. The display example is expecting 2-d clustered points 
> and the example is producing 5-d points. Additional I/O work is needed before 
> this will play with the rest of the clustering algorithms. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
