This has been very interesting. Thank you for sharing this, Stephen. Your posts on EMR and Mahout are motivating me to get off the golf course and badminton courts and re-energize my efforts with EC2 and EMR.
When we advance this a bit further, it would be great to use Amazon Simple Message Service to move data to S3 from live systems. I apologize for lurking and being lazy. I need to get off my "you know what" and contribute more.

On Thu, Apr 16, 2009 at 10:13 PM, Stephen Green <[email protected]> wrote:
>
> On Apr 16, 2009, at 11:10 AM, Jeff Eastman wrote:
>
>> Hi Stephen,
>>
>> It looks to me like you are on the right track. The original kMeans code
>> and job patterns were written over a year ago, probably on a version of
>> Hadoop 10 or 11 IIRC. They have made significant changes to the file system
>> in the interim and nobody - except you - has tried to run kMeans on EMR.
>
> A man, alone... This would make a good movie!
>
>> It certainly seems like dfs.exists(outPath) should be false if you have no
>> outPath. You have a sharp machete and are making good progress breaking a
>> jungle trail to EMR.
>
> I'll see if I can get the s3n thing working locally and then return to spend
> a few more dimes at Amazon. Once I figure it out, I'll add an issue to Jira
> and see if I can work up a patch for mahout-dev.
>
>> If you'd like to chat on the phone or Skype, please contact me directly
>> (jeff at windwardsolutions dot com).
>
> Thanks!
>
> Steve
> --
> Stephen Green // [email protected]
> Principal Investigator \\ http://blogs.sun.com/searchguy
> Aura Project // Voice: +1 781-442-0926
> Sun Microsystems Labs \\ Fax: +1 781-442-1692
