[ https://issues.apache.org/jira/browse/MAPREDUCE-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated MAPREDUCE-2779:
-------------------------------
    Attachment: MAPREDUCE-2779-0.22.patch

Here is the patch for 0.22. It passes all unit tests except for one known buggy test:

    [junit] Test org.apache.hadoop.raid.TestRaidNode FAILED

Note that the previous patch for trunk no longer applies to trunk, because trunk has gone through a major restructuring since then.

> JobSplitWriter.java can't handle large job.split file
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-2779
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>    Affects Versions: 0.20.205.0, 0.22.0, 0.23.0
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-2779-0.22.patch, MAPREDUCE-2779-trunk.patch
>
>
> We use cascading MultiInputFormat. MultiInputFormat sometimes generates a big
> job.split file, which Hadoop uses internally; sometimes it can exceed 2GB.
> In JobSplitWriter.java, the functions that generate this file use a 32-bit
> signed integer to compute offsets into job.split.
> writeNewSplits
> ...
> int prevCount = out.size();
> ...
> int currCount = out.size();
> writeOldSplits
> ...
> long offset = out.size();
> ...
> int currLen = out.size();
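
The truncation described in the report can be reproduced outside Hadoop. The following is a minimal sketch, not the code from either attached patch. It assumes that `out` behaves like a plain java.io.DataOutputStream, whose size() returns an int that is documented to cap at Integer.MAX_VALUE once the counter overflows. NullOutputStream, LongCountingStream, and measure are hypothetical names introduced only for this illustration.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class SplitOffsetDemo {
    /** Discards all bytes, so the demo needs neither disk space nor heap. */
    static final class NullOutputStream extends OutputStream {
        @Override public void write(int b) {}
        @Override public void write(byte[] b, int off, int len) {}
    }

    /** Hypothetical wrapper that keeps its own 64-bit byte count. */
    static final class LongCountingStream extends OutputStream {
        private final OutputStream out;
        private long count;

        LongCountingStream(OutputStream out) { this.out = out; }

        @Override public void write(int b) throws IOException {
            out.write(b);
            count++;
        }

        @Override public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
            count += len;
        }

        long getCount() { return count; }
    }

    /**
     * Writes totalBytes through a DataOutputStream and returns
     * { value of the int-based size(), value of the 64-bit counter }.
     */
    static long[] measure(long totalBytes) {
        LongCountingStream counter = new LongCountingStream(new NullOutputStream());
        DataOutputStream out = new DataOutputStream(counter);
        byte[] chunk = new byte[1 << 20]; // write in 1 MiB chunks
        try {
            long remaining = totalBytes;
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { out.size(), counter.getCount() };
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;
        long[] result = measure(threeGiB);
        System.out.println("DataOutputStream.size(): " + result[0]);
        System.out.println("64-bit count:            " + result[1]);
    }
}
```

Past 2GB the int-based size() stops advancing, so any offset or length derived from it (as with prevCount, currCount, and currLen above) would no longer match the real position in job.split, whereas a long-based position keeps tracking it.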