Hello Hadoop Community,

Given the tremendous positive feedback we've all had regarding the HDFS, MapReduce, and Common project split, I'd like to propose we take the next step and further separate the existing projects.

I propose we begin by splitting the MapReduce project into separate "Map" and "Reduce" sub-projects. This will provide us the opportunity to tease out the complex interdependencies between "map" and "reduce" that exist today, to encourage us to write more modular and isolated code, which should speed releases. This will also aid our users who exclusively run map-only or reduce-only jobs. These are important use-cases, and so should be given high priority.

Given that these two portions of the existing MapReduce project share a great deal of code, we will likely need to release these two new projects concurrently at first, but the eventual goal should certainly be to be able to release "Map" and "Reduce" independently. This seems intuitive to me, given the remarkable recent advancements in the academic community regarding "reduce," while the research coming out of the "map" academics has largely stagnated of late.

If this proposal is accepted, and it has the success I think it will, then we should strongly consider splitting the other two projects as well. My gut instinct is that we should split "HDFS" into "HD" and "FS" sub-projects, and simply rename the "Common" project to "C'Mon." We can think about the details of what exactly these project splits mean later.

Please let me know what you think.

Best,
Aaron