Hi,
Sometimes the cluster dumper output for k-means contains colons:
0~~~0~~~VL-0{n=147408 c=[1032.927, 17.964, 11.384, 11.384] r=[10245.867,
761.066, 62.758, 62.758]}
1~~~1~~~VL-1{n=6 c=[0:2859913.130, 1:561.007] r=[0:366747.921, 1:1189.343]}
2~~~2~~~VL-2{n=3 c=[5335512.995, 96.320, 4.709, 4.709]
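
Judging from the sample above, the colon form looks like index:value pairs (as a sparse vector would be printed), while the plain form lists every component in order. That reading is an assumption drawn from the sample, not confirmed against the Mahout source. Under that assumption, a minimal sketch for post-processing the dump so both forms end up in the same shape could look like this; parse_vector and parse_cluster are hypothetical helper names, not part of Mahout:

```python
import re

# Matches the start of one cluster entry, for example:
#   VL-1{n=6 c=[0:2859913.130, 1:561.007] r=[...]}
# Only the id, n and c=[...] are required, so a line whose r=[...] part
# is wrapped or cut off still parses.
CLUSTER_RE = re.compile(r"(?P<id>[A-Z]+-\d+)\{n=(?P<n>\d+) c=\[(?P<c>[^\]]*)\]")


def parse_vector(body):
    """Return {index: value} for either the dense or the index:value form."""
    values = {}
    for position, token in enumerate(t.strip() for t in body.split(",")):
        if not token:
            continue
        if ":" in token:
            # Assumed sparse form: "index:value"
            index, value = token.split(":", 1)
            values[int(index)] = float(value)
        else:
            # Dense form: the position in the list is the index
            values[position] = float(token)
    return values


def parse_cluster(line):
    """Extract id, point count and center from one dump line, or None."""
    match = CLUSTER_RE.search(line)
    if match is None:
        return None
    return {
        "id": match.group("id"),
        "n": int(match.group("n")),
        "center": parse_vector(match.group("c")),
    }
```

Keying the center by index means a dense and a colon-style entry can be compared directly; indices missing from a colon-style entry are presumably zero, which is again an assumption.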
Hi Konstantin,
Good to hear from you.
The link you mentioned points to EigenSeedGenerator, not
RandomSeedGenerator. The problem seems to be with the call to
fs.getFileStatus(input).isDir()
It's been a while and I don't remember, but perhaps you have to set
additional Hadoop fs properties to use
Hi Peyman,
Good to hear from you. Not sure if anyone has responded to you yet, but the
answer to your question is that I am not aware of any benchmarking that was
done for Mahout's CVB implementation. Others, please jump in here if you think
otherwise.
What has changed in LDA from 0.7 to 0.9?
- 0.7 had LDA
I specifically have fixed mapreduce jobs by doing what the error message
suggests.
But maybe (hopefully) there is another workaround that is configuration-driven.
Just a hunch, but maybe Mahout needs to be refactored to create FileSystem
objects using the get(uri, conf) calls?
As hadoop evolves to
Another wild guess: I've had issues trying to use the 's3' protocol from Hadoop
and got things working by using the 's3n' protocol instead.
On Mar 16, 2014, at 8:41 AM, Jay Vyas jayunit...@gmail.com wrote:
I specifically have fixed mapreduce jobs by doing what the error message
suggests.
I've also encountered a similar error once. It's really just the
FileSystem.get call that needs to be modified. I think it's a good idea
to walk through the codebase and refactor this where necessary.
--sebastian
On 03/16/2014 05:16 PM, Andrew Musselman wrote:
Another wild guess, I've had
I agree, it's best to be explicit when creating filesystem instances by using
the two-argument get(...). It's time to update it to the FileSystem 2.0 APIs.
Can you file a JIRA for this? If not, I will :)
On Mar 16, 2014, at 12:37 PM, Sebastian Schelter s...@apache.org wrote:
I've also encountered a
Yes, there are ways to debug.
One is to attach a remote debugger, set breakpoints, etc., like so:
https://www.google.com/search?q=attach+remote+debugger+java+example+eclipse+or+intellij
The other would be to write a log4j.properties file for Mahout and/or
Hadoop and set the logging level to more