[ 
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102380#comment-13102380
 ] 

Shannon Quinn edited comment on MAHOUT-524 at 9/12/11 12:05 AM:
----------------------------------------------------------------

I've been tooling around with this code for a few hours now and cannot figure 
out where the pesky "/data" is being appended to the overall path...or why the 
second Path that Lance mentioned isn't what is actually being used. It has to 
be somewhere in the Lanczos solver code (filtering into the 
DistributedRowMatrix and its TimesSquaredJob, as the latter is what is actually 
causing the exception), but in all my searching and println()-ing of paths I 
can't seem to find it.

Just prior to the TimesSquaredJob kicking off, the Lanczos solver outputs that 
it is "Finding 4 singular vectors", followed by this output:

11/09/11 19:28:42 INFO mapred.FileInputFormat: Total input paths to process : 2

which is very confusing to me, since in the TimesSquaredJob 
"createTimesSquaredJobConf()" method, there is only one invocation of 
FileInputFormat.addInputPath(). This mysterious second input path may very well 
be the cause of the problems, but again I just can't seem to find where it's 
added.
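
One possible reading of that log line, offered as a sketch rather than a confirmed diagnosis: as far as I understand the old mapred FileInputFormat, the "Total input paths to process" count is taken after each input directory has been expanded into the non-hidden files it contains, so a single addInputPath() call on a directory holding two files could report 2. The plain-Java snippet below only imitates that counting rule. The class name, the method names, and the directory listing are all hypothetical, and the hidden-file rule (names starting with "_" or ".") is an assumption about Hadoop's default filter, not code copied from Hadoop or Mahout.

```java
import java.util.Arrays;
import java.util.List;

public class InputPathCount {

    // Assumption: files whose names start with "_" or "." are treated as
    // hidden and skipped when an input directory is expanded.
    public static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Counts the entries of one (hypothetical) directory listing that would
    // be kept as job input under the assumed hidden-file rule.
    public static int countInputFiles(List<String> directoryListing) {
        int count = 0;
        for (String name : directoryListing) {
            if (!isHidden(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical contents of a single input directory.
        List<String> listing = Arrays.asList(
                "part-00000", "part-00001", "_logs", ".part-00000.crc");
        System.out.println(
                "Total input paths to process : " + countInputFiles(listing));
    }
}
```

If that reading holds, listing the contents of the directory passed to addInputPath() would show whether the "2" is two files under one input path rather than a second path added somewhere else.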

I'm going to keep looking, but any help in finding this bug would be greatly 
appreciated.

      was (Author: magsol):
    I've been tooling around with this code for a few hours now and cannot 
figure out where the pesky "/data" is being appended to the overall path...or 
why the second Path that Lance mentioned isn't what is actually being used. It 
has to be somewhere in the Lanczos solver code (filtering into the 
DistributedRowMatrix and its TimesSquaredJob, as the latter is what is actually 
causing the exception), but in all my searching and println()-ing of paths I 
can't seem to find it.

I'm going to keep looking, but any help in finding this bug would be greatly 
appreciated.
  
> DisplaySpectralKMeans example fails
> -----------------------------------
>
>                 Key: MAHOUT-524
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-524
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.4, 0.5
>            Reporter: Jeff Eastman
>            Assignee: Shannon Quinn
>              Labels: clustering, k-means, visualization
>             Fix For: 0.6
>
>         Attachments: aff.txt, raw.txt, spectralkmeans.png
>
>
> I've committed a new display example that attempts to push the standard 
> mixture of models data set through spectral k-means. After some tweaking of 
> configuration arguments and a bug fix in EigenCleanupJob it runs spectral 
> k-means to completion. The display example is expecting 2-d clustered points 
> and the example is producing 5-d points. Additional I/O work is needed before 
> this will play with the rest of the clustering algorithms. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
