[ https://issues.apache.org/jira/browse/MAPREDUCE-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J updated MAPREDUCE-3991:
-------------------------------
          Resolution: Fixed
       Fix Version/s: 0.23.3
                      0.24.0
    Target Version/s:   (was: 0.23.3, 0.24.0)
              Status: Resolved  (was: Patch Available)

Committed to branch-0.23 and trunk.

> Streaming FAQ has some wrong instructions about input files splitting
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3991
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3991
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 0.23.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Trivial
>             Fix For: 0.24.0, 0.23.3
>
>         Attachments: MAPREDUCE-3991.patch, MAPREDUCE-3991.patch
>
>
> The Streaming docs say, at:
> http://hadoop.apache.org/common/docs/current/streaming.html#How+do+I+process+files%2C+one+per+map%3F
> "Generate a file containing the full HDFS path of the input files. Each map
> task would get one file name as input."
> This is incorrect: MapReduce splits an input file by size, not by lines, so
> a map task is not guaranteed to receive exactly one file name.
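
The point raised in the issue description (input is split by size, not by line) can be illustrated with a small stand-alone sketch. This is a simplified model and not Hadoop code: the function name `lines_per_split`, the example paths, and the rule "a line belongs to the split containing its first byte" are all assumptions made for illustration, and the sketch ignores the exact boundary handling of Hadoop's real record readers.

```python
from typing import List


def lines_per_split(data: bytes, split_size: int) -> List[List[bytes]]:
    """Group the lines of `data` by fixed-size byte ranges (hypothetical model).

    A line is assigned to the split that contains its first byte.
    """
    # Number of byte ranges needed to cover the data (ceiling division).
    n_splits = max(1, -(-len(data) // split_size))
    splits: List[List[bytes]] = [[] for _ in range(n_splits)]
    offset = 0
    for line in data.splitlines(keepends=True):
        splits[offset // split_size].append(line.rstrip(b"\n"))
        offset += len(line)
    return splits


# A hypothetical "list of input files", one made-up HDFS path per line.
paths = b"/a/1\n/a/2\n/a/3\n"

# With a realistic split size (64 MB here), the whole list fits in one split,
# so a single map task would see all three file names.
print(lines_per_split(paths, 64 * 1024 * 1024))

# Only when the split size happens to equal the line length does each
# split hold exactly one line.
print(lines_per_split(paths, 5))
```

Under this model, a short list of paths is far smaller than any realistic split size, so all of its lines land in a single split, which is why the quoted FAQ wording is misleading.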