Looks interesting -- it seems to be specialized for a certain kind of iterative algorithm, a kind that covers a lot of algorithms. Is it distributed? It looked to me more like it's intended for high-performance machines. I guess it also differs in being C++-based rather than Hadoop-based.
Hadoop is, in the end, a tool that was never conceived for general distributed computation. But among frameworks it's (relatively) well understood and available. It seems like Mahout has taken on the mission of delivering something that works on the framework that's out there now, which is a practical rather than theoretically motivated goal. (I think it's a good goal too.) I see that as a difference from many research-oriented projects. Beyond that it is the same sort of thing, and that's good.
The thing I "worry" most about being duplicated is actually Pig. It at least gives something more like "primitives" for basic information-shuffling operations on Hadoop, like the sorts of pivots, joins, and filters that go into your standard implementation of an ML algorithm. I bet we'd find we'd be better off bringing in some stuff from Pig rather than reinventing the join a few times over.
But first things first... it would really be good to focus on revamping and bringing together what we have already, pulling out the commonality, before thinking about what we can improve about those commonalities.

On Tue, Mar 8, 2011 at 11:07 PM, Shannon Quinn <[email protected]> wrote:
> Being the newbie on the block, forgive me if I'm rehashing old news: has
> anything seen/heard of GraphLab before?
>
> http://www.graphlab.ml.cmu.edu/index.html
>
> It's written by someone who has an office in the same exact building as I
> do, just one floor up, so I'll certainly be talking to him soon. But if
> there is someone here who is familiar with this work, can you elaborate on
> the differences between it and Mahout? He seems to have somewhat tweaked the
> standard map/reduce paradigm into something that offers more crosstalk
> flexibility between nodes at runtime (at the cost of significant
> configurational overhead, most likely), but beyond that it seems strikingly
> similar to the functionality Mahout provides.
>
> Anyway, was pointed to this by someone in my department while I was running
> my coalescing thesis ideas by him.
>
> Shannon
>
