[jira] Commented: (MAPREDUCE-1247) Send out-of-band heartbeat to avoid fake lost tasktracker

2010-08-29 Thread Liyin Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904091#action_12904091
 ] 

Liyin Liang commented on MAPREDUCE-1247:


Hi Guanyin, our production cluster hit the same problem. Would you please attach 
your patch file? Thanks.

> Send out-of-band heartbeat to avoid fake lost tasktracker
> -
>
> Key: MAPREDUCE-1247
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1247
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: ZhuGuanyin
>Assignee: ZhuGuanyin
>
> Currently the TaskTracker reports task status to the JobTracker through the 
> heartbeat. Sometimes the TaskTracker holds a lock on itself while doing cleanup 
> work, such as removing task temp data on disk, so the heartbeat thread can hang 
> for a long time waiting for that lock. The JobTracker then assumes the tracker 
> is lost and reschedules all of its finished maps and unfinished reduces on 
> other TaskTrackers; we call this a "fake lost tasktracker". Sometimes this is 
> not acceptable, especially when we run some large jobs. So we introduce an 
> out-of-band heartbeat mechanism to send a heartbeat immediately in that case.
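
A minimal sketch of the idea (not the attached patch; the class and field names 
are illustrative assumptions): the task cleanup path signals completion, and the 
heartbeat loop cuts its sleep short so an out-of-band heartbeat can be sent 
before the JobTracker's expiry interval elapses.

{noformat}
// Illustrative sketch only -- not the actual TaskTracker change.
class HeartbeatNotifier {
  private final Object lock = new Object();
  private boolean taskFinished = false;

  // Called from the task-completion / cleanup path.
  void signalTaskFinished() {
    synchronized (lock) {
      taskFinished = true;
      lock.notifyAll();              // wake the heartbeat thread immediately
    }
  }

  // Heartbeat loop: wait the normal interval, but return early (so an
  // out-of-band heartbeat is sent) if a task finished in the meantime.
  void waitForNextHeartbeat(long heartbeatIntervalMs) throws InterruptedException {
    synchronized (lock) {
      if (!taskFinished) {
        lock.wait(heartbeatIntervalMs);
      }
      taskFinished = false;
    }
  }
}
{noformat}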

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-2019) Add targets for gridmix unit and system tests in a gridmix build xml file.

2010-08-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-2019:
-

Attachment: MAPREDUCE-2019.patch

Addressed Ranjit's comments. It runs only the unit or system tests of gridmix, 
because the build.xml file is under the gridmix project.

> Add  targets for gridmix unit and system tests in a gridmix build xml file.
> ---
>
> Key: MAPREDUCE-2019
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2019
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: contrib/gridmix
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: MAPREDUCE-2019.patch, MAPREDUCE-2019.patch
>
>
> Add the targets for both unit and system tests in the gridmix build xml 
> (src/contrib/gridmix/build.xml). The target name for system tests would be 
> 'test-system' and, likewise, the target name for unit tests would be 'test'.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1668) RaidNode should only Har a directory if all its parity files have been created

2010-08-29 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-1668:


  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

I just committed this. Thanks Ram!

> RaidNode should only Har a directory if all its parity files have been created
> --
>
> Key: MAPREDUCE-1668
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1668
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1668.patch
>
>
> In the current code, a directory can be archived (Har'ed) before all of its 
> parity files have been generated, since parity file generation is not atomic. 
> We should verify that all the parity files are present before archiving a 
> directory.
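
A minimal sketch of such a check (not the attached patch; the parity-location 
convention below is an assumption): list the source directory and confirm each 
file's parity file already exists before handing the directory to the har job.

{noformat}
// Illustrative sketch only. Assumes parity files live under a fixed prefix
// mirroring the source path, e.g. /raid + /user/foo/part-00000.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class ParityCheckSketch {
  static boolean allParityFilesPresent(FileSystem fs, Path srcDir, Path parityPrefix)
      throws IOException {
    for (FileStatus stat : fs.listStatus(srcDir)) {
      if (stat.isDir()) {
        continue;                     // only leaf files carry parity here
      }
      // strip the leading '/' so the source path nests under the parity prefix
      String relative = stat.getPath().toUri().getPath().substring(1);
      Path parity = new Path(parityPrefix, relative);
      if (!fs.exists(parity)) {
        return false;                 // parity not generated yet -- don't har
      }
    }
    return true;
  }
}
{noformat}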

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2035) Enable -Wall and fix warnings in task-controller build

2010-08-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904077#action_12904077
 ] 

Allen Wittenauer commented on MAPREDUCE-2035:
-

Surprisingly, there are very few compiler-agnostic options.  [In fact, the only 
two that I can think of are -c and -o, and I'm sure something somewhere breaks 
those!]  Removing compiler-specific flags even from autoconf files is a pain 
when doing portability work because they tend to sneak in everywhere.

In this particular case, I'm fairly certain you can test for $GCC = yes.  For 
example, in a local patch I have that fixes -Wall for g++, I do the following:

# turn -Wall and -strict-prototypes for G++
if test "$GXX" = yes; then
  CXXFLAGS="$CXXFLAGS -Wall -strict-prototypes"
else
  # SunStudio requires -features=extensions
  AC_CACHE_CHECK([whether $CXX accepts -features=extensions],
    [ha_cv_cxx__features],
    [save_CXXFLAGS=$CXXFLAGS
     CXXFLAGS="$CXXFLAGS -features=extensions"
     AC_LINK_IFELSE([AC_LANG_PROGRAM([], [])],
       [ha_cv_cxx__features=yes],
       [ha_cv_cxx__features=no])
     test "$ha_cv_cxx__features" = no && CXXFLAGS=$save_CXXFLAGS
    ])
fi



> Enable -Wall and fix warnings in task-controller build
> --
>
> Key: MAPREDUCE-2035
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2035
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task-controller
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: mapreduce-2035-toreview.txt, mapreduce-2035.txt
>
>
> Enabling -Wall shows a bunch of warnings. We should enable them and then fix 
> them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test

2010-08-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904070#action_12904070
 ] 

Konstantin Boudnik commented on MAPREDUCE-1897:
---

I ran the commit tests:
{noformat}
ant run-commit-test
Buildfile: build.xml
...
run-commit-test:
...
BUILD SUCCESSFUL
Total time: 13 minutes 41 seconds
{noformat}
Patch is ready to be committed in my opinion. If I don't hear otherwise I'll 
commit it by COB Monday.

> trunk build broken on compile-mapred-test
> -
>
> Key: MAPREDUCE-1897
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.21.0
> Environment: RHEL4 Linux, Java 1.6.0_15-b03
>Reporter: Greg Roelofs
>Assignee: Konstantin Boudnik
> Attachments: MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, 
> MAPREDUCE-1897.patch, MAPREDUCE-1897.patch
>
>
> ...apparently.  Fresh checkout of trunk (all three hadoop-*), 
> build.properties project.version fix, ant veryclean mvn-install of common, 
> hdfs, and then mapreduce:
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52:
>  cannot access org.apache.hadoop.test.system.DaemonProtocol
> [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not 
> found
> [javac]   static class FakeJobTracker extends JobTracker {
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  cannot find symbol
> [javac] symbol  : variable trackers
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method taskTrackers()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, 
> totalSlots/2, 
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64:
>  method does not override or implement a method from a supertype
> [javac] @Override
> [javac] ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  cannot find symbol
> [javac] symbol  : variable totalSlots
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91:
>  establishFirstContact(org.apache.hadoop.mapred.JobTracker,java.lang.String) 
> in org.apache.hadoop.mapred.FakeObjectUtilities cannot be applied to 
> (org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker,java.lang.String)
> [javac]   FakeObjectUtilities.establishFirstContact(jobTracker, 
> s);
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:170

[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test

2010-08-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904068#action_12904068
 ] 

Konstantin Boudnik commented on MAPREDUCE-1897:
---

Apache Hudson patch verification is broken, so I ran {{test-patch.sh}} 
locally:
{noformat}
+1 overall.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 13 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 system tests framework.  The patch passed system tests framework compile.
{noformat}
Test run results will follow.

> trunk build broken on compile-mapred-test
> -
>
> Key: MAPREDUCE-1897
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.21.0
> Environment: RHEL4 Linux, Java 1.6.0_15-b03
>Reporter: Greg Roelofs
>Assignee: Konstantin Boudnik
> Attachments: MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, 
> MAPREDUCE-1897.patch, MAPREDUCE-1897.patch
>
>
> ...apparently.  Fresh checkout of trunk (all three hadoop-*), 
> build.properties project.version fix, ant veryclean mvn-install of common, 
> hdfs, and then mapreduce:
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52:
>  cannot access org.apache.hadoop.test.system.DaemonProtocol
> [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not 
> found
> [javac]   static class FakeJobTracker extends JobTracker {
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  cannot find symbol
> [javac] symbol  : variable trackers
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method taskTrackers()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, 
> totalSlots/2, 
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64:
>  method does not override or implement a method from a supertype
> [javac] @Override
> [javac] ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  cannot find symbol
> [javac] symbol  : variable totalSlots
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91:
>  establishFirstContact(org.

[jira] Updated: (MAPREDUCE-1670) RAID should avoid policies that scan their own destination path

2010-08-29 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-1670:


  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

I just committed this. Thanks Ram.

> RAID should avoid policies that scan their own destination path
> ---
>
> Key: MAPREDUCE-1670
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1670
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1670.patch
>
>
> Raid currently allows policies whose destination directory lies inside the 
> source directory, and vice-versa.
> Both situations can create cycles and should be avoided.
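
A minimal sketch of the kind of validation meant here (not the committed change): 
reject a policy when either the source or the destination directory is nested 
inside the other.

{noformat}
// Illustrative sketch only -- not the RaidNode code.
import org.apache.hadoop.fs.Path;

class PolicyPathCheckSketch {
  private static String withTrailingSlash(String p) {
    return p.endsWith("/") ? p : p + "/";
  }

  // True if one directory contains the other (or they are equal),
  // i.e. the policy would scan its own destination and could cycle.
  static boolean pathsOverlap(Path srcDir, Path destDir) {
    String src = withTrailingSlash(srcDir.toUri().getPath());
    String dest = withTrailingSlash(destDir.toUri().getPath());
    return src.startsWith(dest) || dest.startsWith(src);
  }
}
{noformat}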

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-2020) Use new FileContext APIs for all mapreduce components

2010-08-29 Thread Krishna Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krishna Ramachandran updated MAPREDUCE-2020:


Attachment: mapred-2020.patch

First cut, with primary focus on the JobTracker, UserLogCleaner and some util 
classes.

TaskTracker, JobHistory, CleanUpQueue and other components are "work in 
progress" and not part of this patch.

Initial goals:
* get initial feedback from mapred and hdfs
* ask for enhancements/fixes from DFS where it is inadequate/broken
* optimize/eliminate needless RPC calls (exists() checks)
* streamline API calls (eliminate direct FileSystem calls)

Refactoring is still work in progress.

"ant test" did not show any regressions.

test-patch output:

 [exec]
 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] -1 Eclipse classpath. The patch causes the Eclipse classpath to 
differ from the contents of the lib directories.






> Use new FileContext APIs for all mapreduce components 
> --
>
> Key: MAPREDUCE-2020
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2020
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Krishna Ramachandran
>Assignee: Krishna Ramachandran
> Attachments: mapred-2020.patch
>
>
> Migrate mapreduce components to use the improved FileContext APIs implemented in 
> HADOOP-4952 and 
> HADOOP-6223
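
For readers unfamiliar with the API, a small usage sketch of the kind of call 
site this migration targets (illustrative only, not the attached patch; the 
method and directory here are made up):

{noformat}
// Illustrative sketch of FileContext usage replacing FileSystem-style calls.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

class FileContextSketch {
  static void ensureDir(Configuration conf, Path dir) throws IOException {
    FileContext fc = FileContext.getFileContext(conf);
    if (!fc.util().exists(dir)) {                 // explicit existence check
      fc.mkdir(dir, FsPermission.getDefault(), true /* create parent dirs */);
    }
  }
}
{noformat}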

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed

2010-08-29 Thread Dick King (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903991#action_12903991
 ] 

Dick King commented on MAPREDUCE-323:
-

I also fixed a problem with {{TestJobCleanup}}, which without this fix leaves 
files in a temp directory, trashing a subsequent {{TestJobOutputCommitter}} run 
if one occurs before the temp directory is cleared.  It's very annoying to 
have tests that fail in a full unit test run but not in isolation.

> Improve the way job history files are managed
> -
>
> Key: MAPREDUCE-323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Amar Kamat
>Assignee: Dick King
>Priority: Critical
> Attachments: MR323--2010-08-20--1533.patch, 
> MR323--2010-08-25--1632.patch, MR323--2010-08-27--1359.patch, 
> MR323--2010-08-27--1613.patch
>
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This 
> can cause problems when there is a need to search the history folder 
> (job-recovery etc). It would be nice if we grouped all the jobs under a _user_ 
> folder, so all the jobs for user _amar_ would go in _history-folder/amar/_. 
> Jobs can be categorized using various features like _jobid, date, jobname_ 
> etc., but using _username_ makes the search much more efficient and also 
> does not result in a namespace explosion. 
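
A tiny sketch of the proposed layout (illustrative only; the root path and file 
naming are assumptions, not the patch): derive the per-user directory from the 
job's submitting user before writing the history file.

{noformat}
// Illustrative only -- not the committed layout code.
import org.apache.hadoop.fs.Path;

class HistoryPathSketch {
  // e.g. historyRoot=/history-folder, user=amar, file=job_201008290001_0042
  static Path userHistoryFile(Path historyRoot, String user, String historyFileName) {
    Path userDir = new Path(historyRoot, user);      // /history-folder/amar
    return new Path(userDir, historyFileName);       // .../job_..._0042
  }
}
{noformat}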

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-323) Improve the way job history files are managed

2010-08-29 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-323:


  Status: Patch Available  (was: Open)
Release Note: 
This patch does four things:

* it changes the directory structure of the done directory that holds 
history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 
2N files on DFS to find out facts about the N jobs that have completed since 
the job tracker started [which can be hundreds of thousands of files in 
practical cases],
* it changes the job history browser to display more information and allow 
more filtering criteria, and
* it creates a new programmatic interface for finding files matching 
user-chosen criteria. This allows users to no longer be concerned with our 
methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain 
history file PATHs given search criteria, is described below:

package org.apache.hadoop.mapreduce.jobhistory;
...

// this interface is within O.A.H.mapreduce.jobhistory.JobHistory:

// holds information about one job history log in the done
//   job history logs
public static class JobHistoryJobRecord {
   public Path getPath() { ... }
   public String getJobIDString() { ... }
   public long getSubmitTime() { ... }
   public String getUserName() { ... }
   public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator<JobHistoryJobRecord> {
   // usual Interface methods -- remove() throws UnsupportedOperationException
   // returns the number of calls to next() that will succeed
   public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers the Paths of all matching
// job history files, in no particular order.  Any criterion that is null or
// the empty string does not constrain.
// All criteria that are specified are applied conjunctively, except that if
// there's more than one date you retrieve all Paths matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in the format exactly MM/DD/YYYY.
// Dates' leading digits must be 2's.  We're incubating a Y3K problem.
public JobHistoryRecordRetriever getMatchingJob
    (String soughtUser, String soughtJobName, String[] dateStrings,
     String soughtJobid)
  throws IOException
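
A hypothetical usage sketch of the API above (not part of the release note; the 
{{history}} instance and the argument values are made up, and the retriever is 
assumed to iterate over JobHistoryJobRecord):

{noformat}
// Hypothetical usage only.
JobHistory.JobHistoryRecordRetriever retriever =
    history.getMatchingJob("amar", null, new String[] { "08/29/2010" }, null);
System.out.println(retriever.numMatches() + " matching history files");
while (retriever.hasNext()) {
  JobHistory.JobHistoryJobRecord record = retriever.next();
  System.out.println(record.getJobIDString() + " -> " + record.getPath());
}
{noformat}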



  was:
This patch does four things:

* it changes the directory structure of the done directory that holds 
history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 
2N files on DFS to find out facts about the N jobs that have completed since 
the job tracker started [which can be hundreds of thousands of files in 
practical cases],
* it changes the job history browser to display more information and allow 
more filtering criteria, and
* it creates a new programmatic interface for finding files matching 
user-chosen criteria. This allows users to no longer be concerned with our 
methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain 
history file PATHs given search criteria, is described below:

package org.apache.hadoop.mapreduce.jobhistory;
...

// within JobHistory:

// holds information about one job history log in the done
//   job history logs
public static class JobHistoryJobRecord {
   public Path getPath() { ... }
   public String getJobIDString() { ... }
   public long getSubmitTime() { ... }
   public String getUserName() { ... }
   public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator<JobHistoryJobRecord> {
   // usual Interface methods -- remove() throws UnsupportedOperationException
   // returns the number of calls to next() that will succeed
   public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers the Paths of all matching
// job history files, in no particular order.  Any criterion that is null or
// the empty string does not constrain.
// All criteria that are specified are applied conjunctively, except that if
// there's more than one date you retrieve all Paths matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in the format exactly MM/DD/YYYY.
// Dates' leading digits must be 2's.  We're incubating a Y3K problem.
public JobHistoryRecordRetriever
(String soughtUser, String soughtJobName, String[] dateStrings, String 
soughtJobid)