archive partSize should be configurable
---------------------------------------

Key: MAPREDUCE-1465
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1465
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar

The archive part size is currently set to 2GB. Archiving 10^5 small files took 52 minutes because the job ran with only 1 mapper.

{noformat}
-bash-3.1$ time $H archive ${Q} -archiveName ${DIR}.3.har -p ${PARENT} ${DIR} ${PARENT}
10/02/06 01:55:14 INFO mapred.JobClient: Running job: job_201002042035_5737
...
10/02/06 02:47:18 INFO mapred.JobClient: map 100% reduce 100%
10/02/06 02:47:19 INFO mapred.JobClient: Job complete: job_201002042035_5737
...
10/02/06 02:47:19 INFO mapred.JobClient: Reduce input records=100002

real 52m27.188s
user 0m29.314s
sys 0m1.276s
{noformat}
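
To illustrate why a fixed 2GB part size can leave a small-file archive with a single mapper, here is a rough sketch. It assumes the number of map tasks is approximately the total input size divided by the part size, with a minimum of one; that rule is an assumption for illustration, not confirmed harchive behavior. The class name `PartSizeEstimate`, the helper `estimateMaps`, and the 10 KB average file size are all hypothetical, since the report does not give the actual file sizes.

```java
public class PartSizeEstimate {

    /**
     * Hypothetical helper: estimates the number of map tasks as
     * totalBytes / partSizeBytes, never less than one.
     * This rule is an assumption for illustration only.
     */
    static int estimateMaps(long totalBytes, long partSizeBytes) {
        if (partSizeBytes <= 0) {
            throw new IllegalArgumentException("partSizeBytes must be positive");
        }
        long maps = totalBytes / partSizeBytes;
        return (int) Math.max(1L, maps);
    }

    public static void main(String[] args) {
        // Assumed workload: 10^5 files averaging 10 KB each (not from the report).
        long totalBytes = 100_000L * 10L * 1024L;

        long twoGb = 2L * 1024 * 1024 * 1024;    // the fixed part size from the report
        long sixtyFourMb = 64L * 1024 * 1024;    // a hypothetical smaller setting

        System.out.println("partSize=2GB  -> " + estimateMaps(totalBytes, twoGb) + " map task(s)");
        System.out.println("partSize=64MB -> " + estimateMaps(totalBytes, sixtyFourMb) + " map task(s)");
    }
}
```

Under these assumptions the total input stays well below 2GB, so the estimate never rises above one map task, while a smaller part size would spread the same input across several.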