map output transfers of more than 2^31 bytes are failing
--------------------------------------------------------
Key: HADOOP-1452
URL: https://issues.apache.org/jira/browse/HADOOP-1452
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.13.0
Reporter: Christian Kunz
Symptom:
WARN org.apache.hadoop.mapred.ReduceTask: java.io.IOException: Incomplete map
output received for
http://<host>:50060/mapOutput?map=task_0026_m_000298_0&reduce=61 (2327458761
instead of 2327347307)
WARN org.apache.hadoop.mapred.ReduceTask: task_0026_r_000061_0 adding host
<host> to penalty box, next contact in 263 seconds
Besides failing to fetch the data, the reduce task retries forever. The number
of retries should be limited.
Source of the problem:
In mapred/TaskTracker.java, the variable totalRead, which keeps track of how
many bytes have been sent to the reducer, is declared as int. It should be
declared as long:
...
    int totalRead = 0;
    int len = mapOutputIn.read(buffer, 0,
                               partLength < MAX_BYTES_TO_READ
                               ? (int)partLength : MAX_BYTES_TO_READ);
    while (len > 0) {
      try {
        outStream.write(buffer, 0, len);
        outStream.flush();
      } catch (IOException ie) {
        isInputException = false;
        throw ie;
      }
      totalRead += len;
      if (totalRead == partLength) break;
      len = mapOutputIn.read(buffer, 0,
                             (partLength - totalRead) < MAX_BYTES_TO_READ
                             ? (int)(partLength - totalRead) :
                             MAX_BYTES_TO_READ);
    }
...
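
To illustrate the overflow, here is a minimal standalone sketch. It is not the TaskTracker code and not a proposed patch. The class name MapOutputCopy, the helper names, and the 64 KB chunk size are assumptions made for the example; only partLength, totalRead and MAX_BYTES_TO_READ echo the snippet above. It shows what an int counter holds after the expected number of bytes from the log, and a copy loop that counts with a long instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class MapOutputCopy {
    // Hypothetical chunk size; the real MAX_BYTES_TO_READ value is not shown in the report.
    static final int MAX_BYTES_TO_READ = 64 * 1024;

    /** Copies at most partLength bytes from in to out and returns the number copied. */
    static long copyPartition(InputStream in, OutputStream out, long partLength)
            throws IOException {
        byte[] buffer = new byte[MAX_BYTES_TO_READ];
        long totalRead = 0; // long, so the count can exceed Integer.MAX_VALUE
        while (totalRead < partLength) {
            int toRead = (int) Math.min(partLength - totalRead, MAX_BYTES_TO_READ);
            int len = in.read(buffer, 0, toRead);
            if (len <= 0) {
                break; // stream ended before partLength bytes were available
            }
            out.write(buffer, 0, len);
            totalRead += len;
        }
        return totalRead;
    }

    /** Value an int byte counter holds after the given number of bytes (low 32 bits). */
    static int intCounterAfter(long bytes) {
        return (int) bytes; // same result as repeated int += len, which wraps modulo 2^32
    }

    public static void main(String[] args) throws IOException {
        long partLength = 2327347307L; // expected size taken from the log above
        // The int counter has wrapped, so it can never compare equal to partLength.
        System.out.println(intCounterAfter(partLength)); // prints -1967619989

        // Small-scale check of the long-based loop: copy 10 bytes of a 25-byte stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyPartition(new ByteArrayInputStream(new byte[25]), out, 10);
        System.out.println(copied); // prints 10
    }
}
```

With an int counter the check totalRead == partLength can never succeed for partitions larger than 2^31 - 1 bytes, so the loop keeps reading past the end of the partition. That is consistent with the log, where more bytes were received than expected.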