And if you do this, you could in fact make the MahoutMapper/Reducer classes serializable/Writable, and do fun things like have a class (something like DistributedDataSet) with a method like dataSet.mapReduce(myMahoutMapper, myMahoutReducer), which returns a transformed DistributedDataSet. Configuration data too big to fit inside a Configuration object could be side-channelled via the DistributedCache...
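To make the idea concrete, here is a minimal sketch of what such a fluent dataSet.mapReduce(...) API could look like. Everything here is hypothetical: the MahoutMapper/MahoutReducer interfaces and the DistributedDataSet class are assumed names from this thread, not actual Mahout API, and the "distributed" data set is just an in-memory toy that runs map, group-by-key, and reduce locally.

```java
import java.util.*;
import java.util.function.BiConsumer;

// Hypothetical Mahout-style interfaces (names assumed from this thread).
interface MahoutMapper<K, V, K2, V2> {
    void map(K key, V value, BiConsumer<K2, V2> collector);
}

interface MahoutReducer<K2, V2, K3, V3> {
    void reduce(K2 key, List<V2> values, BiConsumer<K3, V3> collector);
}

// Toy in-memory stand-in for the proposed DistributedDataSet.
class DistributedDataSet<K, V> {
    private final List<Map.Entry<K, V>> records = new ArrayList<>();

    void add(K key, V value) {
        records.add(new AbstractMap.SimpleEntry<>(key, value));
    }

    // mapReduce returns a new, transformed DistributedDataSet.
    <K2, V2, K3, V3> DistributedDataSet<K3, V3> mapReduce(
            MahoutMapper<K, V, K2, V2> mapper,
            MahoutReducer<K2, V2, K3, V3> reducer) {
        // Map phase, then shuffle (group values by intermediate key).
        Map<K2, List<V2>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> rec : records) {
            mapper.map(rec.getKey(), rec.getValue(), (k2, v2) ->
                    grouped.computeIfAbsent(k2, k -> new ArrayList<>()).add(v2));
        }
        // Reduce phase emits into the new data set.
        DistributedDataSet<K3, V3> out = new DistributedDataSet<>();
        grouped.forEach((k2, vs) -> reducer.reduce(k2, vs, out::add));
        return out;
    }

    List<Map.Entry<K, V>> records() { return records; }
}

public class MapReduceSketch {
    public static void main(String[] args) {
        DistributedDataSet<Long, String> lines = new DistributedDataSet<>();
        lines.add(0L, "the quick fox");
        lines.add(1L, "the lazy dog");

        // Word count as a single chained call.
        DistributedDataSet<String, Integer> counts =
                lines.<String, Integer, String, Integer>mapReduce(
                        (key, line, collect) -> {
                            for (String w : line.split("\\s+")) collect.accept(w, 1);
                        },
                        (word, ones, collect) -> collect.accept(word, ones.size()));

        for (Map.Entry<String, Integer> e : counts.records())
            System.out.println(e.getKey() + "=" + e.getValue());
    }
}
```

In a real implementation the mapper/reducer objects would be serialized into the job (hence the Writable requirement) rather than invoked in-process.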
On Tue, May 25, 2010 at 1:18 PM, Robin Anil <[email protected]> wrote:
> Algorithms implement Mahout Mapper/Reducer interfaces. AbstractJob sets up
> jobs by calling Hadoop functions. There will be only one Hadoop
> Mapper/Reducer class, which will take a Mahout Mapper/Reducer during setup
> and call the map function inside it.
>
> Robin
