I don't know of any other viable alternative at the moment, and I think any alternative would be sufficiently different that it would be hard to meaningfully abstract it away without inventing our own little MapReduce layer. That still doesn't save anyone from thinking about the details of configuring the underlying implementation -- in fact, now they have to worry about configuring a Mahout-specific MapReduce layer as well.
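For illustration, here is a hypothetical sketch of what such a "little MapReduce layer" might look like. None of these interfaces exist in Mahout; the names (`Mapper`, `Reducer`, `MiniMapReduce.run`) are invented for this sketch, and a toy in-memory "provider" stands in for Hadoop:

```java
// Hypothetical provider-neutral MapReduce abstraction -- a sketch only,
// not a real Mahout or Hadoop API.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

public class MiniMapReduce {

    /** Emits zero or more (key, value) pairs per input record. */
    interface Mapper<I, K, V> {
        void map(I input, BiConsumer<K, V> emit);
    }

    /** Folds all values collected for one key into a single output. */
    interface Reducer<K, V, O> {
        O reduce(K key, List<V> values);
    }

    /** Toy local provider: map, shuffle by key, reduce -- all in memory. */
    static <I, K, V, O> Map<K, O> run(List<I> inputs,
                                      Mapper<I, K, V> mapper,
                                      Reducer<K, V, O> reducer) {
        Map<K, List<V>> shuffled = new LinkedHashMap<>();
        for (I input : inputs) {
            mapper.map(input, (k, v) ->
                shuffled.computeIfAbsent(k, key -> new ArrayList<>()).add(v));
        }
        Map<K, O> result = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> entry : shuffled.entrySet()) {
            result.put(entry.getKey(),
                       reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Word count against the toy provider.
        List<String> lines = Arrays.asList("a b a", "b c");
        Map<String, Integer> counts = run(
            lines,
            (String line, BiConsumer<String, Integer> emit) -> {
                for (String w : line.split(" ")) emit.accept(w, 1);
            },
            (String word, List<Integer> ones) -> ones.size());
        System.out.println(counts);  // {a=2, b=2, c=1}
    }
}
```

Note what the sketch does not hide: a real Hadoop-backed provider would still need job configuration, input/output formats, and serialization set up by the caller -- which is exactly the point about the abstraction not saving anyone from the underlying details.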
(In comparison, take a look at something as simple as logging. Through people inventing abstractions, and abstractions on abstractions, it has actually turned into something difficult to manage: using SLF4J, putting in the right binding .jar so it routes through Log4J -- and don't forget log4j.xml -- which you have to use because your dependencies use it; and then, what about that library that tries to select Log4J or Commons Logging on its own, but is using Commons because it found it on the classpath, and now you don't remember which file configures that, and...)

On Mon, Sep 7, 2009 at 8:32 AM, Lukáš Vlček <lukas.vl...@gmail.com> wrote:
> Hi,
> just a note: Wouldn't it be better to talk about MapReduce as opposed to
> Hadoop? This means that for each algorithm implemented in Mahout it should
> be clearly stated whether it is a MapReduce-based implementation or not (or
> using other ways to make it scalable). I can imagine it could be useful to
> abstract from Hadoop to the point where it would be possible to use
> different MapReduce providers. I am not sure whether there is any consensus
> about how a MapReduce interface API should look, but Mahout could be a
> good candidate for a project to define and create an abstract MapReduce API.
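For concreteness, the logging tangle in the parenthetical above usually means three jars on the classpath (slf4j-api, the slf4j-log4j12 binding, and log4j itself) plus a log4j.xml along these lines -- a generic sketch of a standard Log4J 1.x configuration, not a file from any particular project:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- Where the log lines actually end up once SLF4J has routed them. -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d %-5p %c - %m%n"/>
    </layout>
  </appender>
  <root>
    <level value="info"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
```

The catch is that this file only governs what goes through Log4J; a dependency that auto-selected Commons Logging is configured somewhere else entirely, which is the "which file configures that" problem.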