Stack,

That section was written by Doug after he and I had the same debate many moons ago. While I can't say with absolute certainty that you shouldn't use a reducer, what I can say is that in every situation where I have seen an M/R job writing to HBase, you end up not wanting to use a reducer. If you want a clear and concise statement, you can say that the rule of thumb is that you don't want to use a reducer, and that cases where you would need to first use a reducer are the rare exception.

The reason I ask people to think about this topic is that unless you have a really good foundation in databases, not relying on a reducer is a bit counterintuitive. (Which is why I said that you really need to clear your mind and focus on this issue.)

-Mike

PS. If you care to read the thread, I didn't become condescending until a certain individual piped up about how refactoring the M/R was a 'distraction' from the issue at hand. Not to mention his flip response w the Google paper?

On May 10, 2012, at 4:57 PM, Stack wrote:

> On Thu, May 10, 2012 at 11:59 AM, Michael Segel
> <michael_se...@hotmail.com> wrote:
>> Sigh.
>>
>> Dave,
>> I really think you need to think more about the problem.
>>
>> Think about what a reduce does and then think about what happens in side of
>> HBase.
>>
>> Then think about which runs faster... a job with two mappers writing the
>> intermediate and final results in HBase,
>> or a M/R job that writes its output to HBase.
>>
>> If you really truly think about the problem, you will start to understand
>> why I say you really don't want to use a reducer when you're working w HBase.
>>
>
> We have a bit of doc that usually you might want to forego reduce
> phase,
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#sink.
> Do we need to add to it? That said, you can't make an hard and fast
> rule that the reduce is to be avoided absolutely. There will be cases
> where it makes sense (MR sort orthogonal to HBase's or a fat
> aggregating reduce, etc.)
>
> St.Ack
> P.S. Hey Michael. Go easy on the 'sighs'. The participants in this
> thread have a clue. I can testify to that. Also, I know you don't
> mean it, but on occasion, both in this thread and in others I've seen
> you on, your tone can come across as condescending (and there is
> nothing like condescension for raising the rankles). We all have our
> style's but you might want to review with this in mind before you hit
> send the next time. Just a suggestion.