[ https://issues.apache.org/jira/browse/MAPREDUCE-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated MAPREDUCE-2779:
-------------------------------
    Affects Version/s: 0.20.205.0
               Status: Patch Available  (was: Open)

> JobSplitWriter.java can't handle large job.split file
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-2779
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>    Affects Versions: 0.20.205.0, 0.22.0, 0.23.0
>            Reporter: Ming Ma
>         Attachments: MAPREDUCE-2779-trunk.patch
>
>
> We use Cascading's MultiInputFormat. MultiInputFormat sometimes generates a
> large job.split file, which Hadoop uses internally; the file can exceed 2 GB.
> In JobSplitWriter.java, the functions that generate this file use 32-bit
> signed integers to compute offsets into job.split:
>
> writeNewSplits
> ...
> int prevCount = out.size();
> ...
> int currCount = out.size();
>
> writeOldSplits
> ...
> long offset = out.size();
> ...
> int currLen = out.size();
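
To illustrate the arithmetic behind the report, here is a minimal, self-contained sketch. It is not the attached patch and does not use Hadoop classes. `LongCountingOutputStream`, `SplitOffsetSketch`, `recordOffsets`, and `narrowToInt` are hypothetical names invented for this example. The sketch shows what happens when a byte position past 2 GB is narrowed to a 32-bit signed `int`, and how tracking the position in a `long` avoids the problem. Note that `java.io.DataOutputStream.size()` documents that its counter is capped at `Integer.MAX_VALUE` on overflow rather than going negative; whether `out` in JobSplitWriter behaves that way is not stated in the report, but in either case the `int` value is no longer a usable offset.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical wrapper that tracks the write position in a 64-bit counter. */
final class LongCountingOutputStream extends FilterOutputStream {
    private long position;

    LongCountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        position++;
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        out.write(buf, off, len);
        position += len;
    }

    /** Number of bytes written so far, as a long. */
    long getPos() {
        return position;
    }
}

final class SplitOffsetSketch {
    /** What a byte position becomes when it is stored in a 32-bit signed int. */
    static int narrowToInt(long position) {
        return (int) position;
    }

    /** Writes each record in turn and returns the start offset of each as a long. */
    static long[] recordOffsets(byte[][] records) {
        try (LongCountingOutputStream out =
                 new LongCountingOutputStream(new ByteArrayOutputStream())) {
            long[] offsets = new long[records.length];
            for (int i = 0; i < records.length; i++) {
                offsets[i] = out.getPos(); // offset stays 64-bit end to end
                out.write(records[i]);
            }
            return offsets;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // 3221225472 bytes
        System.out.println("as long: " + threeGiB);
        System.out.println("as int:  " + narrowToInt(threeGiB)); // -1073741824

        long[] offsets = recordOffsets(new byte[][] {new byte[3], new byte[2], new byte[4]});
        System.out.println(java.util.Arrays.toString(offsets)); // [0, 3, 5]
    }
}
```

The point of the sketch is that every variable holding a position or a length derived from positions (the `prevCount`, `currCount`, and `currLen` values quoted above) would need to be 64-bit for offsets beyond 2 GB to survive.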