Raghu Angadi wrote:
> Torsten Curdt wrote:
>> I just checked our mapred.submit.replication and it is higher than the
>> number of nodes in the cluster - maybe that's the problem?
> This pretty much assures at least a few of these exceptions.
So we have a workaround: lower mapred.submit.replication. And it's
arguably not a bug, but just a misfeature, since it only causes spurious
warnings.
One fix might be to try to determine mapred.submit.replication based on
the cluster size. But that was contentious when that feature was added,
and I'd rather not re-open that argument now.
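For concreteness, the cluster-size idea would amount to something like the
sketch below. This is only an illustration, not existing Hadoop code: the
class and method names are made up, and it assumes the caller already knows
how many datanodes are live.

```java
public final class SubmitReplication {
    private SubmitReplication() {}

    /**
     * Hypothetical helper: cap the configured mapred.submit.replication at
     * the number of live datanodes, but never return less than 1.
     */
    public static int effective(int configured, int liveDatanodes) {
        if (configured < 1) {
            throw new IllegalArgumentException("configured replication must be >= 1");
        }
        // A block cannot usefully have more replicas than there are nodes.
        return Math.max(1, Math.min(configured, liveDatanodes));
    }
}
```

With a configured value of 10 on a 4-node cluster this yields 4; on a 20-node
cluster the configured value is left alone.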
> You can argue that Namenode should not schedule a block to a node
> twice, and I agree.
That sounds like a good thing to fix. Should we file a bug?
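To pin down the behavior a bug report would ask for, here is a minimal
sketch. It is not how the Namenode actually picks targets; the class name,
the method, and the use of plain strings for nodes are all assumptions made
for illustration. The point is only that a node is never chosen twice, and
that fewer targets are returned when the cluster is smaller than the
requested replication.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public final class DistinctTargets {
    private DistinctTargets() {}

    /**
     * Hypothetical target selection: walk the candidates in order and keep
     * at most `replication` distinct nodes. If there are fewer distinct
     * nodes than requested, return what is available instead of repeating
     * a node.
     */
    public static List<String> choose(List<String> candidates, int replication) {
        Set<String> chosen = new LinkedHashSet<>();
        for (String node : candidates) {
            if (chosen.size() >= replication) {
                break;
            }
            chosen.add(node); // no effect if the node was already chosen
        }
        return new ArrayList<>(chosen);
    }
}
```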
Doug