[jira] Commented: (MAPREDUCE-1247) Send out-of-band heartbeat to avoid fake lost tasktracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904091#action_12904091 ]

Liyin Liang commented on MAPREDUCE-1247:
----------------------------------------

Hi Guanyin, our production cluster hit the same problem. Would you please attach your patch file? Thanks.

> Send out-of-band heartbeat to avoid fake lost tasktracker
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-1247
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1247
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: ZhuGuanyin
>            Assignee: ZhuGuanyin
>
> Currently the TaskTracker reports task status to the JobTracker through its heartbeat. Sometimes the TaskTracker holds its lock to do cleanup work, such as removing task temp data on disk, and the heartbeat thread then hangs for a long time waiting for that lock. The JobTracker assumes the tracker is lost and reschedules all of its finished maps and unfinished reduces on other TaskTrackers. We call this a "fake lost tasktracker", and it is sometimes unacceptable, especially when running large jobs. We therefore introduce a mechanism to send an out-of-band heartbeat in that case.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
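The out-of-band mechanism described above boils down to a heartbeat loop that can be woken early when something important happens, instead of waiting out the full interval. The following is an illustrative, self-contained sketch of that idea; all class and method names here are invented for illustration, not taken from the MAPREDUCE-1247 patch:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OutOfBandHeartbeater {
    private final Object heartbeatMonitor = new Object();
    private final AtomicInteger heartbeatsSent = new AtomicInteger();
    private volatile boolean oobRequested = false;

    // Stand-in for the real heartbeat RPC to the JobTracker.
    private void sendHeartbeat() { heartbeatsSent.incrementAndGet(); }

    // Request an immediate heartbeat instead of waiting for the next interval.
    public void requestOutOfBandHeartbeat() {
        synchronized (heartbeatMonitor) {
            oobRequested = true;
            heartbeatMonitor.notifyAll();
        }
    }

    // One iteration of the heartbeat loop: wait up to intervalMillis, but
    // return (and heartbeat) early if an out-of-band heartbeat was requested.
    public void heartbeatOnce(long intervalMillis) throws InterruptedException {
        synchronized (heartbeatMonitor) {
            long deadline = System.currentTimeMillis() + intervalMillis;
            while (!oobRequested && System.currentTimeMillis() < deadline) {
                heartbeatMonitor.wait(Math.max(1, deadline - System.currentTimeMillis()));
            }
            oobRequested = false;
        }
        sendHeartbeat();
    }

    public int heartbeatsSent() { return heartbeatsSent.get(); }

    public static void main(String[] args) throws Exception {
        OutOfBandHeartbeater hb = new OutOfBandHeartbeater();
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            hb.requestOutOfBandHeartbeat();
        }).start();
        hb.heartbeatOnce(10_000);  // returns early rather than blocking ~10 s
        System.out.println("heartbeats sent: " + hb.heartbeatsSent());
    }
}
```

The key point is that the trigger only wakes the dedicated heartbeat loop; the slow cleanup work and the heartbeat send no longer serialize on the same long-held lock.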
[jira] Updated: (MAPREDUCE-2019) Add targets for gridmix unit and system tests in a gridmix build xml file.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay Kumar Thota updated MAPREDUCE-2019:
-----------------------------------------

    Attachment: MAPREDUCE-2019.patch

Addressed Ranjit's comments. It runs only the unit or system tests of gridmix, because the build.xml file is under the gridmix project.

> Add targets for gridmix unit and system tests in a gridmix build xml file.
> --------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2019
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2019
>             Project: Hadoop Map/Reduce
>          Issue Type: Task
>          Components: contrib/gridmix
>            Reporter: Vinay Kumar Thota
>            Assignee: Vinay Kumar Thota
>         Attachments: MAPREDUCE-2019.patch, MAPREDUCE-2019.patch
>
> Add targets for both unit and system tests in the gridmix build xml (src/contrib/gridmix/build.xml). The target name for system tests would be 'test-system', and likewise the target name for unit tests would be 'test'.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1668) RaidNode should only Har a directory if all its parity files have been created
[ https://issues.apache.org/jira/browse/MAPREDUCE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated MAPREDUCE-1668:
----------------------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

I just committed this. Thanks Ram!

> RaidNode should only Har a directory if all its parity files have been created
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1668
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1668
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Rodrigo Schmidt
>            Assignee: Ramkumar Vadali
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1668.patch
>
> In the current code, a directory can be archived (Har'ed) before all of its parity files have been generated, since parity file generation is not atomic. We should verify that all the parity files are present before archiving a directory.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-2035) Enable -Wall and fix warnings in task-controller build
[ https://issues.apache.org/jira/browse/MAPREDUCE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904077#action_12904077 ]

Allen Wittenauer commented on MAPREDUCE-2035:
---------------------------------------------

Surprisingly, there are very few compiler-agnostic options. [In fact, the only two that I can think of are -c and -o, and I'm sure something somewhere breaks those!] Removing compiler-specific flags from even autoconf files is a pain when doing portability work, because they tend to sneak in everywhere. In this particular case, I'm fairly certain you can test for $GCC = yes. For example, in a local patch I have to fix -Wall for g++, I do the following:

{noformat}
# turn on -Wall and -strict-prototypes for G++
if test "$GXX" = yes; then
  CXXFLAGS="$CXXFLAGS -Wall -strict-prototypes"
else
  # SunStudio requires -features=extensions
  AC_CACHE_CHECK([whether $CXX accepts -features=extensions],
    [ha_cv_cxx__features],
    [save_CXXFLAGS=$CXXFLAGS
     CXXFLAGS="$CXXFLAGS -features=extensions"
     AC_LINK_IFELSE([AC_LANG_PROGRAM([], [])],
       [ha_cv_cxx__features=yes],
       [ha_cv_cxx__features=no])
     test "$ha_cv_cxx__features" = no && CXXFLAGS=$save_CXXFLAGS])
fi
{noformat}

> Enable -Wall and fix warnings in task-controller build
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-2035
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2035
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task-controller
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: mapreduce-2035-toreview.txt, mapreduce-2035.txt
>
> Enabling -Wall shows a bunch of warnings. We should enable them and then fix them.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test
[ https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904070#action_12904070 ]

Konstantin Boudnik commented on MAPREDUCE-1897:
-----------------------------------------------

I have run the commit tests:
{noformat}
ant run-commit-test
Buildfile: build.xml
...
run-commit-test:
...
BUILD SUCCESSFUL
Total time: 13 minutes 41 seconds
{noformat}
In my opinion the patch is ready to be committed. If I don't hear otherwise, I'll commit it by COB Monday.

> trunk build broken on compile-mapred-test
> -----------------------------------------
>
>                 Key: MAPREDUCE-1897
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.21.0
>         Environment: RHEL4 Linux, Java 1.6.0_15-b03
>            Reporter: Greg Roelofs
>            Assignee: Konstantin Boudnik
>         Attachments: MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, MAPREDUCE-1897.patch
>
> ...apparently. Fresh checkout of trunk (all three hadoop-*), build.properties project.version fix, ant veryclean mvn-install of common, hdfs, and then mapreduce:
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52: cannot access org.apache.hadoop.test.system.DaemonProtocol
>     [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not found
>     [javac]   static class FakeJobTracker extends JobTracker {
>     [javac]   ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60: non-static variable this cannot be referenced from a static context
>     [javac]       this.trackers = tts;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60: cannot find symbol
>     [javac] symbol  : variable trackers
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
>     [javac]       this.trackers = tts;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67: cannot find symbol
>     [javac] symbol  : method taskTrackers()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           taskTrackers().size() - getBlacklistedTrackerCount(),
>     [javac]           ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67: cannot find symbol
>     [javac] symbol  : method getBlacklistedTrackerCount()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           taskTrackers().size() - getBlacklistedTrackerCount(),
>     [javac]                                   ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68: cannot find symbol
>     [javac] symbol  : method getBlacklistedTrackerCount()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, totalSlots/2,
>     [javac]           ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64: method does not override or implement a method from a supertype
>     [javac]     @Override
>     [javac]     ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73: non-static variable this cannot be referenced from a static context
>     [javac]       this.totalSlots = totalSlots;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73: cannot find symbol
>     [javac] symbol  : variable totalSlots
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
>     [javac]       this.totalSlots = totalSlots;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91: establishFirstContact(org.apache.hadoop.mapred.JobTracker,java.lang.String) in org.apache.hadoop.mapred.FakeObjectUtilities cannot be applied to (org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker,java.lang.String)
>     [javac]         FakeObjectUtilities.establishFirstContact(jobTracker, s);
>     [javac]                            ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:170
[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test
[ https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904068#action_12904068 ]

Konstantin Boudnik commented on MAPREDUCE-1897:
-----------------------------------------------

Apache Hudson patch verification is broken, so I have run {{test-patch.sh}} locally:
{noformat}
+1 overall.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 system tests framework.  The patch passed system tests framework compile.
{noformat}
Test run results to follow.

> trunk build broken on compile-mapred-test
> -----------------------------------------
>
>                 Key: MAPREDUCE-1897
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.21.0
>         Environment: RHEL4 Linux, Java 1.6.0_15-b03
>            Reporter: Greg Roelofs
>            Assignee: Konstantin Boudnik
>         Attachments: MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, MAPREDUCE-1897.patch, MAPREDUCE-1897.patch
>
> ...apparently. Fresh checkout of trunk (all three hadoop-*), build.properties project.version fix, ant veryclean mvn-install of common, hdfs, and then mapreduce:
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52: cannot access org.apache.hadoop.test.system.DaemonProtocol
>     [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not found
>     [javac]   static class FakeJobTracker extends JobTracker {
>     [javac]   ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60: non-static variable this cannot be referenced from a static context
>     [javac]       this.trackers = tts;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60: cannot find symbol
>     [javac] symbol  : variable trackers
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
>     [javac]       this.trackers = tts;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67: cannot find symbol
>     [javac] symbol  : method taskTrackers()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           taskTrackers().size() - getBlacklistedTrackerCount(),
>     [javac]           ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67: cannot find symbol
>     [javac] symbol  : method getBlacklistedTrackerCount()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           taskTrackers().size() - getBlacklistedTrackerCount(),
>     [javac]                                   ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68: cannot find symbol
>     [javac] symbol  : method getBlacklistedTrackerCount()
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
>     [javac]           getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, totalSlots/2,
>     [javac]           ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64: method does not override or implement a method from a supertype
>     [javac]     @Override
>     [javac]     ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73: non-static variable this cannot be referenced from a static context
>     [javac]       this.totalSlots = totalSlots;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73: cannot find symbol
>     [javac] symbol  : variable totalSlots
>     [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
>     [javac]       this.totalSlots = totalSlots;
>     [javac]       ^
>     [javac] /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91: establishFirstContact(org.
[jira] Updated: (MAPREDUCE-1670) RAID should avoid policies that scan their own destination path
[ https://issues.apache.org/jira/browse/MAPREDUCE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated MAPREDUCE-1670:
----------------------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

I just committed this. Thanks Ram.

> RAID should avoid policies that scan their own destination path
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1670
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1670
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Rodrigo Schmidt
>            Assignee: Ramkumar Vadali
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1670.patch
>
> Raid currently allows policies in which the destination directory is included in the source directory, and vice versa. Both situations can create cycles and should be avoided.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2020) Use new FileContext APIs for all mapreduce components
[ https://issues.apache.org/jira/browse/MAPREDUCE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krishna Ramachandran updated MAPREDUCE-2020:
--------------------------------------------

    Attachment: mapred-2020.patch

First cut, with primary focus on JobTracker, UserLogCleaner, and some util classes. TaskTracker, JobHistory, CleanUpQueue, and other components are "work in progress" and not part of this patch.

Initial goals:
* get initial feedback from mapred and hdfs
* ask for enhancements/fixes from DFS where inadequate/broken
* optimize/eliminate needless RPC calls (exists() checks)
* streamline API calls (eliminate calls to FileSystem); refactoring is work in progress

"ant test" did not show any regressions.

testpatch output:
{noformat}
     [exec] -1 overall.
     [exec]
     [exec] +1 @author. The patch does not contain any @author tags.
     [exec]
     [exec] +1 tests included. The patch appears to include 6 new or modified tests.
     [exec]
     [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]
     [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec] -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.
{noformat}

> Use new FileContext APIs for all mapreduce components
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-2020
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2020
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Krishna Ramachandran
>            Assignee: Krishna Ramachandran
>         Attachments: mapred-2020.patch
>
> Migrate mapreduce components to using the improved FileContext APIs implemented in HADOOP-4952 and HADOOP-6223.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
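One of the goals above, eliminating needless exists() RPC calls, follows a general pattern: rather than asking the server whether a path exists and then operating on it (two round trips, plus a race window in between), attempt the operation directly and handle the not-found error (one round trip). A minimal sketch of the pattern using java.nio.file as a stand-in for the Hadoop file APIs; nothing here is taken from the attached patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class OpenWithoutExists {
    // Anti-pattern: exists() followed by a read costs two metadata round
    // trips and is racy (the file can disappear in between the two calls).
    public static byte[] readIfPresentRacy(Path p) throws IOException {
        if (!Files.exists(p)) {
            return null;
        }
        return Files.readAllBytes(p);
    }

    // Preferred: attempt the read and treat "not found" as the answer;
    // one round trip and no race window.
    public static byte[] readIfPresent(Path p) throws IOException {
        try {
            return Files.readAllBytes(p);
        } catch (NoSuchFileException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fc-demo", ".txt");
        Files.write(tmp, "hello".getBytes());
        System.out.println(new String(readIfPresent(tmp)));  // hello
        Files.delete(tmp);
        System.out.println(readIfPresent(tmp) == null);      // true
    }
}
```

Against HDFS each avoided exists() is an avoided namenode RPC, which is why the patch calls these checks out explicitly.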
[jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed
[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903991#action_12903991 ]

Dick King commented on MAPREDUCE-323:
-------------------------------------

I also fixed a problem with {{TestJobCleanup}}, which without this fix leaves files in a temp directory, trashing a subsequent {{TestJobOutputCommitter}} run if one occurs before the temp directory is cleared. It's very annoying to have tests that fail in a full unit-test run but not in isolation.

> Improve the way job history files are managed
> ---------------------------------------------
>
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
>         Attachments: MR323--2010-08-20--1533.patch, MR323--2010-08-25--1632.patch, MR323--2010-08-27--1359.patch, MR323--2010-08-27--1613.patch
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This can cause problems when there is a need to search the history folder (job-recovery etc.). It would be nice if we grouped all the jobs under a _user_ folder, so all the jobs for user _amar_ would go into _history-folder/amar/_. Jobs can be categorized using various features like _jobid, date, jobname_ etc., but using _username_ will make the search much more efficient and also will not result in namespace explosion.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
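The TestJobCleanup problem above is a recurring pattern: a test must remove its temp directory on teardown so leftovers cannot trash a later test in the same run. A generic, self-contained sketch of such cleanup (plain Java, not the actual TestJobCleanup change):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TempDirCleanup {
    // Recursively delete a directory tree, deepest entries first, so one
    // test's leftover files cannot pollute a later test in the same run.
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) return;
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder())  // children before parents
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("job-cleanup-demo");
        Files.createDirectories(dir.resolve("sub"));
        Files.write(dir.resolve("sub").resolve("out"), new byte[]{1, 2, 3});
        deleteRecursively(dir);  // what a test's tearDown() would call
        System.out.println(Files.exists(dir));  // false
    }
}
```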
[jira] Updated: (MAPREDUCE-323) Improve the way job history files are managed
[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dick King updated MAPREDUCE-323:
--------------------------------

          Status: Patch Available  (was: Open)
    Release Note:
This patch does four things:
* it changes the directory structure of the done directory that holds history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 2N files on DFS to find out facts about the N jobs that have completed since the job tracker started [which can be hundreds of thousands of files in practical cases],
* it changes the job history browser to display more information and allow more filtering criteria, and
* it creates a new programmatic interface for finding files matching user-chosen criteria. This allows users to no longer be concerned with our methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain history file Paths given search criteria, is described below:

{noformat}
package org.apache.hadoop.mapreduce.jobhistory;

...

// this interface is within O.A.H.mapreduce.jobhistory.JobHistory:

// holds information about one job history log in the done job history logs
public static class JobHistoryJobRecord {
  public Path getPath() { ... }
  public String getJobIDString() { ... }
  public long getSubmitTime() { ... }
  public String getUserName() { ... }
  public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator {
  // usual Interface methods -- remove() throws UnsupportedOperationException
  // returns the number of calls to next() that will succeed
  public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers all Paths of matching job
// history files, in no particular order. Any criterion that is null or the
// empty string does not constrain. All criteria that are specified are applied
// conjunctively, except that if there's more than one date you retrieve all
// Paths matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in exactly the format MM/DD/ .
// Dates' leading digits must be 2's. We're incubating a Y3K problem.
public JobHistoryRecordRetriever getMatchingJob
    (String soughtUser, String soughtJobName, String[] dateStrings, String soughtJobid)
  throws IOException
{noformat}

  was:
This patch does four things:
* it changes the directory structure of the done directory that holds history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 2N files on DFS to find out facts about the N jobs that have completed since the job tracker started [which can be hundreds of thousands of files in practical cases],
* it changes the job history browser to display more information and allow more filtering criteria, and
* it creates a new programmatic interface for finding files matching user-chosen criteria. This allows users to no longer be concerned with our methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain history file Paths given search criteria, is described below:

{noformat}
package org.apache.hadoop.mapreduce.jobhistory;

...

// within JobHistory:

// holds information about one job history log in the done job history logs
public static class JobHistoryJobRecord {
  public Path getPath() { ... }
  public String getJobIDString() { ... }
  public long getSubmitTime() { ... }
  public String getUserName() { ... }
  public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator {
  // usual Interface methods -- remove() throws UnsupportedOperationException
  // returns the number of calls to next() that will succeed
  public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers all Paths of matching job
// history files, in no particular order. Any criterion that is null or the
// empty string does not constrain. All criteria that are specified are applied
// conjunctively, except that if there's more than one date you retrieve all
// Paths matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in exactly the format MM/DD/ .
// Dates' leading digits must be 2's. We're incubating a Y3K problem.
public JobHistoryRecordRetriever
    (String soughtUser, String soughtJobName, String[] dateStrings, String soughtJobid)
{noformat}
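The matching rules stated in the release note (null or empty criteria do not constrain; specified criteria are AND-ed; multiple dates are OR-ed; soughtJobName matches any substring) can be captured in a small predicate. A self-contained sketch of just those semantics; the field names and this helper are hypothetical illustrations, not part of the proposed API:

```java
public class HistoryMatch {
    // Returns true if a record with the given user/jobName/date fields
    // satisfies the criteria. null or "" means "no constraint"; soughtUser
    // matches exactly, soughtJobName matches any substring of the job name,
    // and a non-empty date list matches if ANY of its dates matches.
    public static boolean matches(String user, String jobName, String date,
                                  String soughtUser, String soughtJobName,
                                  String[] dateStrings) {
        if (soughtUser != null && !soughtUser.isEmpty()
                && !soughtUser.equals(user)) {
            return false;                      // exact-match criterion failed
        }
        if (soughtJobName != null && !soughtJobName.isEmpty()
                && !jobName.contains(soughtJobName)) {
            return false;                      // substring criterion failed
        }
        if (dateStrings != null && dateStrings.length > 0) {
            boolean anyDate = false;           // dates are OR-ed together
            for (String d : dateStrings) {
                if (d.equals(date)) { anyDate = true; break; }
            }
            if (!anyDate) return false;
        }
        return true;                           // all specified criteria held
    }

    public static void main(String[] args) {
        System.out.println(matches("amar", "sort-run-3", "08/27/2010",
                "amar", "sort", new String[]{"08/26/2010", "08/27/2010"}));  // true
        System.out.println(matches("amar", "sort-run-3", "08/25/2010",
                null, "", new String[]{"08/26/2010"}));                      // false
    }
}
```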