[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902271#action_12902271 ]
Krishna Ramachandran commented on MAPREDUCE-323:
------------------------------------------------

A few comments to start with:

*JobHistory.java*

{quote}
private static final SortedMap<Long, String>jobToDirectoryMap = new TreeMap<Long, String>();
{quote}
How is this used?

{quote}
public String getConfFilePath(JobID jobId) {
  MovedFileInfo info = jobHistoryFileMap.get(jobId);
  if (info == null) {
    return null;
  }
  final Path historyFileDir = (new Path(getHistoryFilePath(jobId))).getParent();
  return getConfFile(historyFileDir, jobId).toString();
}
{quote}
Instead of looking the path up again, doesn't "info" already have this data (_info.historyFile_)?

I suggest a simple modification to setupEventWriter:
{quote}
public void setupEventWriter(JobID jobId, JobConf jobConf) throws IOException {
  if (logDir == null) {
    LOG.info("Log Directory is null, returning");
    throw new IOException("Missing Log Directory for History");
  }
  MetaInfo oldFi = fileMap.get(jobId);
  long submitTime = (oldFi == null ? System.currentTimeMillis() : oldFi.submitTime);
  String user = getUserName(jobConf);
  String jobName = getJobName(jobConf);
  ....
{quote}

On ThreadPoolExecutor: why was the pool size increased?

{quote}
canonicalHistoryLogDir(JobId,...)
{quote}
jobId is not used in the following:
{quote}
canonicalHistoryLogDir(
{quote}

In this block
{quote}
synchronized (ueState) {
......
+ iShouldMonitor = true;
+
+ ueState.unindexedElements = new LinkedList<JobHistoryIndexElement>();
+ ueState.currentDoneSubdirectory = resultDir;
+
+ ueState.monitoredDirectory = resultDir;
.....
+ ueState.unindexedElements.
+   add(new JobHistoryIndexElement(millisecondTime, id, metaInfo));
{quote}
This code is not entirely clear. Should we increment the count here (unindexedElementCount++)?

A related _unindexedElements_ item: who calls get/addUnindexedElements()?
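To make the count question concrete, here is a minimal sketch of one way the list and the counter could be kept in step. It is not taken from the patch: IndexElement and UnindexedState are hypothetical stand-ins for JobHistoryIndexElement and UnindexedElementsState, and drainUnindexedElements() is an invented name. The idea is that every add goes through a single synchronized method, so unindexedElementCount cannot drift from the list.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for JobHistoryIndexElement; the fields are assumptions.
final class IndexElement {
    final long millisecondTime;
    final String jobId;

    IndexElement(long millisecondTime, String jobId) {
        this.millisecondTime = millisecondTime;
        this.jobId = jobId;
    }
}

// Hypothetical stand-in for UnindexedElementsState.
final class UnindexedState {
    private List<IndexElement> unindexedElements = new LinkedList<>();
    private int unindexedElementCount = 0;

    // Single entry point for adds: the list and the counter change together
    // under the same lock.
    synchronized void addUnindexedElement(IndexElement element) {
        unindexedElements.add(element);
        unindexedElementCount++;
    }

    // Hands back the pending elements and starts a fresh list, resetting the
    // counter in the same critical section.
    synchronized List<IndexElement> drainUnindexedElements() {
        List<IndexElement> drained = unindexedElements;
        unindexedElements = new LinkedList<>();
        unindexedElementCount = 0;
        return drained;
    }

    synchronized int getUnindexedElementCount() {
        return unindexedElementCount;
    }
}
```

With something along these lines, the block quoted above would call addUnindexedElement(...) rather than reaching into ueState.unindexedElements directly. Whether that fits the rest of the patch depends on who else reads or resets the list.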

In class UnindexedElementsState.closeCurrentDirectory()
{quote}
OutputStream newIndexOStream = null;
PrintStream newIndexPStream = null;
{quote}
are unused.

{quote}
+ // time, because iShouldMonitor is only set true when
+ // ueState.monitoredDirectory changes, which will force the
+ // current incumbent to abend at the earliest opportunity.
+ while (iShouldMonitor) {
+   int roundCounter = 0;
+
+   int interruptionsToAbort = 2;
+
+   try {
+     Thread.sleep(1000);
+   } catch (InterruptedException e) {
+     if (--interruptionsToAbort == 0) {
+       return;
+     }
+   }
+
+   synchronized (ueState) {
+     if (ueState.monitoredDirectory != resultDir) {
+       // someone else closed out the directory I was monitoring
+       iShouldMonitor = false;
+     } else if (++roundCounter % 30 == 0) {
+       interruptionsToAbort = 2;
+
{quote}
This is a busy-wait loop with an arbitrary 1 sec sleep. Can this check go on for up to a maximum of 1 hour?

Does the 5 minute checkpoint not set anything?
{quote}
} else if (++roundCounter % 300 == 0) {
  // called for side effect -- a 5 minute checkpoint
  try {
    ueState.getACurrentIndex(ueState.currentDoneSubdirectory); // why?
  } catch (IOException e) {
    LOG.warn("Couldn't build an interim Job History index for "
        + ueState.currentDoneSubdirectory);
  }
{quote}

> Improve the way job history files are managed
> ---------------------------------------------
>
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
>         Attachments: MR323--2010-08-20--1533.patch
>
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This
> can cause problems when there is a need to search the history folder
> (job-recovery etc). It would be nice if we group all the jobs under a _user_
> folder. So all the jobs for user _amar_ will go in _history-folder/amar/_.
> Jobs can be categorized using various features like _jobid, date, jobname_
> etc but using _username_ will make the search much more efficient and also
> will not result into namespace explosion.