Xiaowei Zhu created HDFS-10543:
----------------------------------

             Summary: hdfsRead read stops at block boundary
                 Key: HDFS-10543
                 URL: https://issues.apache.org/jira/browse/HDFS-10543
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Xiaowei Zhu
Reproducer:

char *buf2 = new char[file_info->mSize];
memset(buf2, 0, (size_t)file_info->mSize);
int ret = hdfsRead(fs, file, buf2, file_info->mSize);
delete [] buf2;
if (ret != file_info->mSize) {
  std::stringstream ss;
  ss << "tried to read " << file_info->mSize << " bytes. but read " << ret << " bytes";
  ReportError(ss.str());
  hdfsCloseFile(fs, file);
  continue;
}

When this runs against a file ~1.4 GB large, it returns an error like "tried to read 1468888890 bytes. but read 134217728 bytes". The HDFS cluster it runs against has a block size of 134217728 bytes, so it seems hdfsRead stops at a block boundary. This looks like a regression. We should retry and continue reading across block boundaries so that files with multiple blocks can be read in full.
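As a caller-side illustration of the retry the description asks for, a read loop like the following works. This is a minimal sketch, assuming the standard libhdfs C API (hdfsRead returns the number of bytes read, 0 at EOF, -1 on error); the helper name readFully is hypothetical:

#include "hdfs.h"  /* libhdfs C API; the include path may vary by build */

/* Hypothetical helper: keep calling hdfsRead until `length` bytes have
 * been read, EOF is reached, or an error occurs. Returns the total
 * number of bytes read, or -1 on error. */
static tSize readFully(hdfsFS fs, hdfsFile file, void *buffer, tSize length) {
  tSize total = 0;
  while (total < length) {
    tSize n = hdfsRead(fs, file, static_cast<char *>(buffer) + total,
                       length - total);
    if (n < 0) return -1;  /* read error */
    if (n == 0) break;     /* EOF before `length` bytes */
    total += n;            /* short read, e.g. at a block boundary: retry */
  }
  return total;
}

The reproducer above could call readFully in place of the single hdfsRead call as a workaround, but the proper fix is for the library itself to continue the read across block boundaries.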