Thanks for the helpful replies on this. The data I'm dealing with is such that I may be unable (or unwilling) to load the entire set of counts for <A, *> into memory for some values of A (the curse of Zipfian distributions), so the final "join" step of the process is the tricky part.
Right now I'm still having trouble working out how to force the first element of the set iterated over by a single reducer to be the marginal rather than some individual count. Does anyone know whether Hadoop guarantees (or can be made to guarantee) that the relative order of equal keys is preserved? If so, this would be a fairly easy solution. Thank you!

Chris
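P.S. In case it helps clarify what I'm after, here's a rough, untested sketch (class names made up) of the composite-key setup I've been imagining: emit keys of the form "A\tB", with B = "*" for the marginal, then partition and group on A alone while letting the full composite key drive the sort. That would sidestep the stability question entirely, since the marginal arrives first by construction rather than by luck of the shuffle, and the reducer only needs a single running marginal in memory, never the whole set of counts for <A, *>.

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;

// Partition on the left word A only, so every <A, *> and <A, B>
// pair lands on the same reducer regardless of B.
// (Wired in via job.setPartitionerClass(LeftWordPartitioner.class).)
class LeftWordPartitioner extends Partitioner<Text, LongWritable> {
    @Override
    public int getPartition(Text key, LongWritable value, int numPartitions) {
        String left = key.toString().split("\t", 2)[0];
        return (left.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

// Group on the left word A only, so a single reduce() call sees the
// marginal <A, *> followed by every <A, B>; the sort still runs on the
// whole composite key, which is what puts "*" first within the group.
// (Wired in via job.setGroupingComparatorClass(...).)
class LeftWordGroupingComparator extends WritableComparator {
    protected LeftWordGroupingComparator() {
        super(Text.class, true);
    }
    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        String leftA = a.toString().split("\t", 2)[0];
        String leftB = b.toString().split("\t", 2)[0];
        return leftA.compareTo(leftB);
    }
}

// The reducer accumulates the marginal from the leading <A, *> values,
// then emits relative frequencies for each <A, B>.
class RelativeFrequencyReducer
        extends Reducer<Text, LongWritable, Text, DoubleWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long marginal = 0;
        for (LongWritable count : values) {
            // NB: within a group, Hadoop updates `key` in place as the
            // value iterator advances, so it reflects the current pair.
            String right = key.toString().split("\t", 2)[1];
            if ("*".equals(right)) {
                // All partial marginal sums sort ahead of the real pairs.
                marginal += count.get();
            } else {
                // Assumes per-pair counts were already summed by a combiner.
                context.write(key, new DoubleWritable((double) count.get() / marginal));
            }
        }
    }
}

The one ordering assumption here is that "*" sorts before the real right-hand tokens under Text's default byte comparison (true for typical alphanumeric vocabularies); if it doesn't for your data, a custom sort comparator on the composite key would fix that as well.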