[ https://issues.apache.org/jira/browse/MAPREDUCE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-1248:
-------------------------------------------

    Affects Version/s: 0.22.0
             Assignee: Ruibang He

> Redundant memory copying in StreamKeyValUtil
> --------------------------------------------
>
>                 Key: MAPREDUCE-1248
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1248
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.22.0
>            Reporter: Ruibang He
>            Assignee: Ruibang He
>            Priority: Minor
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1248-v1.0.patch
>
>
> I found that when MROutputThread collects the output of the Reducer, it calls
> StreamKeyValUtil.splitKeyVal(), which allocates two local byte arrays
> for each line of output. These two byte arrays are then passed to the
> variables key and val. The data is therefore copied twice: once by
> System.arraycopy(), and again inside key.set() / val.set().
> This doubles the memory copying for the whole output (which may lead to
> higher CPU consumption) and causes frequent temporary object allocation.
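
The copying pattern described in the report can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: ByteBuf is a hypothetical stand-in for a reusable buffer whose set(byte[], int, int) copies into its own storage, the method names splitWithTempArrays and splitDirect are invented for the example, and a one-byte separator at splitPos is assumed.

```java
import java.nio.charset.StandardCharsets;

public class SplitKeyValSketch {

    /** Hypothetical stand-in for a reusable key/value buffer; not a Hadoop class. */
    static final class ByteBuf {
        private byte[] bytes = new byte[0];
        private int length;

        /** Copies len bytes of src, starting at start, into the internal buffer. */
        void set(byte[] src, int start, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];          // grow only when needed
            }
            System.arraycopy(src, start, bytes, 0, len);
            length = len;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Pattern from the report: two temporary arrays per line, then set() copies again. */
    static void splitWithTempArrays(byte[] line, int lineLen, int splitPos,
                                    ByteBuf key, ByteBuf val) {
        int valLen = lineLen - splitPos - 1;    // assumes a one-byte separator
        byte[] keyBytes = new byte[splitPos];   // temporary allocation
        byte[] valBytes = new byte[valLen];     // temporary allocation
        System.arraycopy(line, 0, keyBytes, 0, splitPos);            // first copy
        System.arraycopy(line, splitPos + 1, valBytes, 0, valLen);   // first copy
        key.set(keyBytes, 0, keyBytes.length);  // second copy
        val.set(valBytes, 0, valBytes.length);  // second copy
    }

    /** Possible alternative: let set() copy straight from the line buffer. */
    static void splitDirect(byte[] line, int lineLen, int splitPos,
                            ByteBuf key, ByteBuf val) {
        key.set(line, 0, splitPos);
        val.set(line, splitPos + 1, lineLen - splitPos - 1);
    }

    public static void main(String[] args) {
        byte[] line = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        ByteBuf key = new ByteBuf();
        ByteBuf val = new ByteBuf();
        splitDirect(line, line.length, 2, key, val);   // the tab is at index 2
        System.out.println(key + " -> " + val);        // prints: k1 -> v1
    }
}
```

Both methods leave key and val with the same contents; the direct version just skips the intermediate arrays, so each output byte is copied once and no per-line temporaries are allocated.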