[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873160#action_12873160 ]
Dick King commented on MAPREDUCE-323:
-------------------------------------

Okay...

1: I will have to fix rumen to recursively descend into a directory of directories to make it capable of swallowing a history directory.

1a: I would like to still process the job IDs in lexicographical order [which is almost always chronological order] for compatibility with applications that expect approximately chronological order.

1b: This creates a memory footprint of about 200b/entry, which may impose a limit of one million jobs or so.

2: I will make the directories configurable. How about the following controls?

||locution||meaning||
|{{%y}} |year [four digits] [The Y10K problem will be someone else's problem :-) ]|
|{{%m}} |month [two digits, leading zeros present]|
|{{%d}} |day [two digits, leading zeros present]|
|{{%h}} |hour [two digits, leading zeros present]|
|{{%i}} |mInute [two digits, leading zeros present]|
|{{%u}} |user|
|{{%xi-j}} |the digits from the jobID index whose positions run from {{i}} through {{j}}, _downwards_, numbered _from the right, 1-based_. If you choose any digits that don't exist you get no characters in the output for those digits. {{%x9-3}} will give you directories holding logs for at most 100 jobs, unless you omit timestamp selection controls.|
|{{/}} |directory component separator [even on platforms with a different separator character] -- if there are two or more slashes in a row we swallow all but one, and note that there's an implicit leading and trailing separator character|
|any other character |itself|

Did I leave anything out?
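To make the proposed controls concrete, here is a minimal Python sketch of how such a pattern might be expanded. It is an illustration only, not the eventual Hadoop implementation. The function name `expand_history_dir` and its parameters are hypothetical. It also assumes that the "jobID index" is the trailing sequence number of an ID such as `job_201006011234_0042`, that `i` and `j` are plain decimal numbers, and that the implicit leading and trailing separators are rendered literally, relative to the history root.

```python
import re
from datetime import datetime

# One token per match: a simple %-control, a %xi-j digit range, or any
# other single character (which stands for itself).
_TOKEN = re.compile(r"%(?:([ymdhiu])|x(\d+)-(\d+))|(.)", re.DOTALL)


def expand_history_dir(pattern, job_id, user, when):
    """Expand a directory pattern for one job (hypothetical helper)."""
    # Assumption: the job index is whatever follows the last underscore.
    index = job_id.rsplit("_", 1)[-1]
    fields = {
        "y": f"{when.year:04d}",
        "m": f"{when.month:02d}",
        "d": f"{when.day:02d}",
        "h": f"{when.hour:02d}",
        "i": f"{when.minute:02d}",
        "u": user,
    }
    out = []
    for match in _TOKEN.finditer(pattern):
        simple, hi, lo, literal = match.groups()
        if simple:
            out.append(fields[simple])
        elif hi is not None:
            # Positions count from the right, 1-based, walking downwards.
            # Positions that do not exist contribute no characters.
            for pos in range(int(hi), int(lo) - 1, -1):
                if 1 <= pos <= len(index):
                    out.append(index[-pos])
        else:
            out.append(literal)
    # Collapse runs of slashes and add the implicit leading and trailing
    # separators.
    parts = [p for p in "".join(out).split("/") if p]
    return "/" + "/".join(parts) + "/" if parts else "/"
```

With the pattern `%y/%m/%d//%u/%x9-3`, user `amar`, job `job_201006011234_0042` and a timestamp of 2010-06-01 12:34, this sketch yields `/2010/06/01/amar/00/`. Only digit positions 4 and 3 of `0042` exist, and the doubled slash collapses to one. Because positions 2 and 1 are left out of the directory name, each such directory holds at most 100 jobs, matching the `%x9-3` remark in the table.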
> Improve the way job history files are managed
> ---------------------------------------------
>
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This
> can cause problems when there is a need to search the history folder
> (job-recovery etc). It would be nice if we group all the jobs under a _user_
> folder. So all the jobs for user _amar_ will go in _history-folder/amar/_.
> Jobs can be categorized using various features like _jobid, date, jobname_
> etc but using _username_ will make the search much more efficient and also
> will not result into namespace explosion.