I set mapred.reduce.tasks manually so that the job runs in a single wave of reducers (does that make sense, by the way?).
When I save the data, I often end up with many small files, because we use compression and Hive doesn't seem to merge small compressed files. So my question is: can I somehow unset mapred.reduce.tasks and make Hive use hive.exec.reducers.bytes.per.reducer instead, to reduce the number of output files? It seems that when both are set, the former overrides the latter.
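
For reference, the settings involved look roughly like this in my Hive session. The numeric values are made up for illustration, and the last line is only my guess at what "disabling" would mean, based on -1 being the documented default for mapred.reduce.tasks in Hive:

```sql
-- What I do now: pin the reducer count (the value 32 is illustrative).
SET mapred.reduce.tasks=32;

-- What I would like Hive to use instead (value in bytes; 256 MB here, illustrative).
SET hive.exec.reducers.bytes.per.reducer=268435456;

-- Assumption: resetting to the default of -1 hands the reducer count back to Hive.
SET mapred.reduce.tasks=-1;
```

I have not confirmed that the last setting changes the number of output files in my case.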