[jira] Updated: (MAPREDUCE-1088) JobHistory files should have narrower 0600 perms
[ https://issues.apache.org/jira/browse/MAPREDUCE-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1088:
-------------------------------------

    Attachment: MAPREDUCE-1088_yhadoop20.patch

Emergency bug-fix to the yahoo hadoop20 distribution, along with HADOOP-6304 - I'll upload one for trunk shortly with narrower perms.

> JobHistory files should have narrower 0600 perms
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1088
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1088
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.20.2
>
>         Attachments: MAPREDUCE-1088_yhadoop20.patch
>
> Currently the perms on JobHistory files are 0740; I propose we make it 0600.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
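The change narrows the history files from 0740 (owner read/write/execute, group read) to 0600 (owner read/write only). As a hypothetical illustration of what that narrowing means, using plain java.nio rather than the Hadoop FsPermission API the patch itself uses:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class NarrowPerms {
    // Tighten a file from 0740 (rwxr-----) to 0600 (rw-------).
    // Illustrative sketch only: the real JobTracker code goes through
    // Hadoop's FileSystem/FsPermission, not java.nio.
    public static void narrow(Path historyFile) throws Exception {
        Set<PosixFilePermission> perms =
            PosixFilePermissions.fromString("rw-------"); // octal 0600
        Files.setPosixFilePermissions(historyFile, perms);
    }

    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("jobhistory", ".log");
        // Simulate the old 0740 perms, then narrow them.
        Files.setPosixFilePermissions(f,
            PosixFilePermissions.fromString("rwxr-----"));
        narrow(f);
        System.out.println(
            PosixFilePermissions.toString(Files.getPosixFilePermissions(f)));
        Files.delete(f);
    }
}
```

With 0600, only the file owner (the JobTracker user) can read the history, which is the point of the fix: group members lose the read access that 0740 granted.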
[jira] Updated: (MAPREDUCE-1088) JobHistory files should have narrower 0600 perms
[ https://issues.apache.org/jira/browse/MAPREDUCE-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1088:
-------------------------------------

    Affects Version/s: 0.20.1
        Fix Version/s: 0.20.2
             Assignee: Arun C Murthy
[jira] Updated: (MAPREDUCE-1103) Additional JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated MAPREDUCE-1103:
--------------------------------------

    Attachment: 1103.patch

Updated to trunk

> Additional JobTracker metrics
> -----------------------------
>
>                 Key: MAPREDUCE-1103
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1103
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>             Fix For: 0.21.0
>
>         Attachments: 1103.patch, 1103.patch
>
> It would be useful to track the following additional JobTracker metrics:
> running{map|reduce}tasks
> busy{map|reduce}slots
> reserved{map|reduce}slots
[jira] Commented: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767213#action_12767213 ]

Hadoop QA commented on MAPREDUCE-28:
------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12422365/MAPREDUCE-28-8.patch
  against trunk revision 826565.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 34 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/console

This message is automatically generated.
> TestQueueManager takes too long and times out some times
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-28
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Amareshwari Sriramadasu
>            Assignee: V.V.Chaitanya Krishna
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-28-1.txt, MAPREDUCE-28-2.txt, MAPREDUCE-28-3.txt, MAPREDUCE-28-4.txt, MAPREDUCE-28-5.txt, MAPREDUCE-28-6.txt, MAPREDUCE-28-7.txt, MAPREDUCE-28-8.patch
>
> TestQueueManager takes a long time to run and sometimes times out.
> See the failure at http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/.
> Looking at the console output, the test was timed out before it finished.
> On my machine, the test takes about 5 minutes.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767211#action_12767211 ]

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

Because there are other operations in the distcp job (e.g., the {{write()}} calls made during the actual upload) that should time out far faster than once per thirty minutes in the event of an error. Using a single timeout value for all operations makes overall program execution considerably less efficient than it should be. Writes and renames in distcp have different expected running times; we should treat them accordingly.

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can perform very slowly, which may cause task timeout.
[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767210#action_12767210 ]

rahul k singh commented on MAPREDUCE-1105:
------------------------------------------

Summary for the yahoo distribution patch:

- Removed the existing "mapred.capacity-scheduler.queue..max.map.slots" and "mapred.capacity-scheduler.queue..max.reduce.slots" variables. These were used to throttle the queue, i.e., they were the hard limit and did not allow the queue to grow further.
- Added the new parameter "mapred.capacity-scheduler.queue..maximum-capacity". maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster, which provides a means to limit how much excess capacity a queue can use. By default, there is no limit: the default value of -1 implies a queue can use the complete capacity of the cluster. The maximum-capacity of a queue can only be greater than or equal to its minimum capacity. This property can be used to keep long-running jobs from occupying more than a certain percentage of the cluster, which, in the absence of pre-emption, could affect the capacity guarantees of other queues. One important thing to note is that maximum-capacity is a percentage, so the maximum capacity in absolute terms changes with the cluster's capacity: if a large number of nodes or racks is added to the cluster, the absolute maximum capacity increases accordingly.
- Added some testcases for unit testing the maximum-capacity knob.
- Removed the testcases for max.map.slots and max.reduce.slots.

Summary of changes for the patch for 21:

- Removed the "mapred.capacity-scheduler.queue..max.map.slots" and "mapred.capacity-scheduler.queue..max.reduce.slots" entries.
- Removed the testcases for the same.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
> Currently the CS caps a queue's capacity to its actual capacity if a hard-limit is specified to be greater than its actual capacity. We should allow the queue to go up to the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be a percentage rather than #slots.
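As a hedged illustration of the new knob, a capacity-scheduler.xml fragment might look like the following (the property name embeds the queue name where the double dot appears above; the queue name "research" and the value are assumptions for illustration, not from the patch):

```xml
<!-- Hypothetical fragment: cap the "research" queue at 40% of the
     cluster even when excess capacity is available. The default of -1
     means no cap, i.e. the queue may grow to the whole cluster. -->
<property>
  <name>mapred.capacity-scheduler.queue.research.maximum-capacity</name>
  <value>40</value>
</property>
```

Because the value is a percentage, the 40% cap here scales automatically as nodes are added to the cluster, unlike the removed max.map.slots/max.reduce.slots limits, which were absolute slot counts.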
[jira] Updated: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-28:
-----------------------------------

    Status: Patch Available  (was: Open)
[jira] Commented: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
[ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767186#action_12767186 ]

Hudson commented on MAPREDUCE-1070:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #84 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/84/])
. Prevent a deadlock in the fair scheduler servlet. Contributed by Todd Lipcon

> Deadlock in FairSchedulerServlet
> --------------------------------
>
>                 Key: MAPREDUCE-1070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.20.2
>
>         Attachments: deadlock.png, mapreduce-1070-branch20.txt, mapreduce-1070.txt
>
> FairSchedulerServlet can cause a deadlock with the JobTracker
[jira] Commented: (MAPREDUCE-1113) mumak compiles aspects even if skip.contrib is true
[ https://issues.apache.org/jira/browse/MAPREDUCE-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767183#action_12767183 ]

Chris Douglas commented on MAPREDUCE-1113:
------------------------------------------

I wouldn't oppose adding the ant-contrib dep in principle, and binary tarballs without contrib would be great, but this is quickly expanding outside its original scope. If this is going into 0.21, then a light touch would be strongly preferred. Perhaps a separate issue for refactoring the build would be appropriate? Either that, or this can be appropriated/renamed for that purpose while MAPREDUCE-1038 tracks the excessive Mumak aspect generation.

> mumak compiles aspects even if skip.contrib is true
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-1113
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1113
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: build, contrib/mumak
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: mapreduce-1113.txt
>
> The compile-aspects task in mumak's build.xml runs regardless of the skip.contrib property. Momentarily uploading a patch to fix this.
[jira] Updated: (MAPREDUCE-1112) Fix CombineFileInputFormat for hadoop 0.20
[ https://issues.apache.org/jira/browse/MAPREDUCE-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated MAPREDUCE-1112:
----------------------------------

    Resolution: Duplicate
        Status: Resolved  (was: Patch Available)

Duplicate of HADOOP-5759.

> Fix CombineFileInputFormat for hadoop 0.20
> ------------------------------------------
>
>                 Key: MAPREDUCE-1112
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1112
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.20.2
>
>         Attachments: MAPREDUCE-1112.1.patch, MAPREDUCE-1112.2.patch
>
> HADOOP-5759 is already fixed as a part of MAPREDUCE-364 in hadoop 0.21.
> This will fix the same problem with CombineFileInputFormat for hadoop 0.20.
[jira] Updated: (MAPREDUCE-847) Adding Apache License Headers and reduce releaseaudit warnings to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-847:
------------------------------------

    Status: Open  (was: Patch Available)

* This should exclude {{src/test/tools/data/rumen}}, perhaps even {{src/test/tools/data}}; some of the files are compressed in trunk (MAPREDUCE-1077)
* libhdfs may be moved to the HDFS project soon (MAPREDUCE-665); the excludes should be removed when/if it does

Other than that, this looks good

> Adding Apache License Headers and reduce releaseaudit warnings to zero
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-847
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-847
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Giridharan Kesavan
>            Assignee: Giridharan Kesavan
>         Attachments: MAPREDUCE-847-v1.PATCH, MAPREDUCE-847.PATCH
>
> [rat:report] Summary
> [rat:report] -------
> [rat:report] Notes: 14
> [rat:report] Binaries: 178
> [rat:report] Archives: 49
> [rat:report] Standards: 1364
> [rat:report]
> [rat:report] Apache Licensed: 1152
> [rat:report] Generated Documents: 9
> [rat:report]
> [rat:report] JavaDocs are generated and so license header is optional
> [rat:report] Generated files do not required license headers
> [rat:report]
> [rat:report] 203 Unknown Licenses
[jira] Commented: (MAPREDUCE-1041) TaskStatuses map in TaskInProgress should be made package private instead of protected
[ https://issues.apache.org/jira/browse/MAPREDUCE-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767178#action_12767178 ]

Hudson commented on MAPREDUCE-1041:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #83 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/83/])
. Make TaskInProgress::taskStatuses map package-private. Contributed by Jothi Padmanabhan

> TaskStatuses map in TaskInProgress should be made package private instead of protected
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1041
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1041
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: mapred-1041.patch
>
> MAPREDUCE-1028 made TaskStatuses protected. As Nigel pointed out in that Jira, making it package private would suffice.
[jira] Commented: (MAPREDUCE-1012) Context interfaces should be Public Evolving
[ https://issues.apache.org/jira/browse/MAPREDUCE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767179#action_12767179 ]

Hudson commented on MAPREDUCE-1012:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #83 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/83/])
. Mark Context interfaces as public evolving. Contributed by Tom White

> Context interfaces should be Public Evolving
> --------------------------------------------
>
>                 Key: MAPREDUCE-1012
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1012
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.21.0
>            Reporter: Tom White
>            Assignee: Tom White
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-1012.patch
>
> As discussed in MAPREDUCE-954 the nascent context interfaces should be marked as Public Evolving to facilitate future evolution.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767176#action_12767176 ]

Chris Douglas commented on MAPREDUCE-972:
-----------------------------------------

Why wouldn't one just set the task timeout for the distcp job to 30 minutes?
[jira] Commented: (MAPREDUCE-1112) Fix CombineFileInputFormat for hadoop 0.20
[ https://issues.apache.org/jira/browse/MAPREDUCE-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767174#action_12767174 ]

Zheng Shao commented on MAPREDUCE-1112:
---------------------------------------

My bad. Yes, it's the same as HADOOP-5759, except for a file name change. I will commit HADOOP-5759 to branch-0.20 and mark this one as a duplicate.
[jira] Updated: (MAPREDUCE-932) Rumen needs a job trace sorter
[ https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-932:
------------------------------------

    Status: Open  (was: Patch Available)

Could this compress the data for the testcase, as in MAPREDUCE-1077?

> Rumen needs a job trace sorter
> ------------------------------
>
>                 Key: MAPREDUCE-932
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: MAPREDUCE-932--2009-09-18-PM.patch, MAPREDUCE-932--2009-09-18.patch, patch-932--2009-08-31--1702.patch
>
> Rumen reads job history logs and produces job traces. The jobs in a job trace do not occur in any promised order. Certain tools need the jobs to be ordered by job submission time. We should include, in Rumen, a tool to sort job traces.
[jira] Updated: (MAPREDUCE-892) command line tool to list all tasktrackers and their status
[ https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-892:
------------------------------------

    Status: Open  (was: Patch Available)

Canceling patch, as it has gone stale.

> command line tool to list all tasktrackers and their status
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: MAPRED-892.patch.3, MAPREDUCE-892.patch, MAPREDUCE-892.patch, MAPREDUCE-892.patch.1
>
> The "hadoop mradmin -report" could list all the tasktrackers that the JobTracker knows about. It would also list a brief status summary for each TaskTracker. (This is similar to the hadoop dfsadmin -report command that lists all Datanodes)
[jira] Updated: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
[ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1070:
-------------------------------------

      Resolution: Fixed
   Fix Version/s: 0.20.2
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I committed this. Thanks, Todd!
[jira] Updated: (MAPREDUCE-962) NPE in ProcfsBasedProcessTree.destroy()
[ https://issues.apache.org/jira/browse/MAPREDUCE-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-962:
------------------------------------

    Description:
This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - \
Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : \
java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}

  was:
This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}

> NPE in ProcfsBasedProcessTree.destroy()
> ---------------------------------------
>
>                 Key: MAPREDUCE-962
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-962
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Vinod K V
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-6232.patch, MR-962.patch, MR-962.v1.patch
>
> This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
> {code}
> 2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - \
> Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : \
> java.lang.NullPointerException
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
>         at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
> {code}
[jira] Updated: (MAPREDUCE-1041) TaskStatuses map in TaskInProgress should be made package private instead of protected
[ https://issues.apache.org/jira/browse/MAPREDUCE-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1041:
-------------------------------------

      Resolution: Fixed
   Fix Version/s: 0.21.0
                      (was: 0.22.0)
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks, Jothi!
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767166#action_12767166 ]

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

My proposal is slightly different from that. The progress thread is in one of three states:

1) {{inRename == true && isComplete == false}}
2) {{inRename == false && isComplete == false}}
3) {{isComplete == true}}

When inRename is set to true, the progress thread will call {{progress()}} every few seconds, for up to a maximum of {{distcp.rename.timeout}} seconds. If it is still in this state after {{distcp.rename.timeout}} seconds have elapsed since the state began, it will set inRename to false.

When inRename is false, it just sits there, waiting for another rename operation to start. It sleeps and occasionally polls for a state change on inRename or isComplete. Changing inRename back to true again returns to the previously-described state; {{distcp.rename.timeout}} starts anew from this time point.

If isComplete is true, the thread exits immediately. The {{Mapper.close()}} method will set isComplete to true to ensure that the thread shuts down. (As the thread is {{setDaemon(true)}}, the JVM will exit even without this detail, but it is good hygiene to do so anyway.)

It is not sufficient to simply call progress() right before rename(). Experience has shown that when uploading large files to S3, the rename() operation itself can take in excess of 10 minutes. rename() in S3 is implemented as copy-and-delete; for multi-GB files, this can take a long time.

If we just tell people to set their global task timeout to 30 minutes, then this will delay task restarts under other conditions where the timeout value is expected to be considerably shorter (e.g., an individual file {{write()}} operation). This can adversely affect distcp performance in the general case.
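The three-state proposal above could be sketched roughly as follows. This is a hypothetical illustration, not the attached patch: the class name, the plain Runnable standing in for Reporter.progress(), and the millisecond timeout parameter (standing in for distcp.rename.timeout) are all assumptions.

```java
// Hedged sketch of the three-state rename-progress thread described above.
public class RenameProgressThread extends Thread {
    private volatile boolean inRename = false;
    private volatile boolean isComplete = false;
    private final long renameTimeoutMs; // stands in for distcp.rename.timeout
    private final Runnable progress;    // stands in for Reporter.progress()

    public RenameProgressThread(long renameTimeoutMs, Runnable progress) {
        this.renameTimeoutMs = renameTimeoutMs;
        this.progress = progress;
        setDaemon(true); // JVM can exit even if the thread lingers
    }

    public void startRename() { inRename = true; }   // state 2 -> state 1
    public void endRename()   { inRename = false; }  // state 1 -> state 2
    public void complete()    { isComplete = true; } // -> state 3

    @Override
    public void run() {
        long renameStart = -1;
        while (!isComplete) {            // state 3: exit immediately
            if (inRename) {              // state 1: keep the task alive
                if (renameStart < 0) {
                    // Timeout window restarts for each new rename.
                    renameStart = System.currentTimeMillis();
                }
                if (System.currentTimeMillis() - renameStart > renameTimeoutMs) {
                    inRename = false;    // give up reporting for this rename
                } else {
                    progress.run();
                }
            } else {
                renameStart = -1;        // state 2: idle, just poll
            }
            try {
                Thread.sleep(50);        // "every few seconds" in real use
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```

Usage would bracket the slow call: startRename() before fs.rename(src, dst), endRename() after it returns, and complete() from Mapper.close().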
[jira] Updated: (MAPREDUCE-1012) Context interfaces should be Public Evolving
[ https://issues.apache.org/jira/browse/MAPREDUCE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1012: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Tom! > Context interfaces should be Public Evolving > > > Key: MAPREDUCE-1012 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1012 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 0.21.0 >Reporter: Tom White >Assignee: Tom White >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1012.patch > > > As discussed in MAPREDUCE-954 the nascent context interfaces should be marked > as Public Evolving to facilitate future evolution. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767161#action_12767161 ] Chris Douglas commented on MAPREDUCE-972: - I see. Extending the FileSystem API is a non-starter, so we can move on from that. Progress threads in general are discouraged (e.g. HADOOP-5052). If I understand your proposal, the progress thread would report starting from the first rename, but stop after some configurable interval. In most cases, I'm not sure how this would differ from simply setting the task timeout higher, since progress is reported between renames. Also, this wouldn't help renames after the thread exits. Would it be sufficient to add a call to progress() right before the rename (after the delete)? In that case, setting the task timeout higher would extend the time allowed for each rename, which is the right level of granularity, anyway. It won't do this automatically for s3 destinations, but pushing that detail into distcp is not ideal, either. One could add a FilterFileSystem that resets a persistent progress thread for each rename, manage all the signaling/locking etc., but its behavior seems indistinguishable from this much simpler tweak. Would this be sufficient? > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-987) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/MAPREDUCE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767135#action_12767135 ] Chris Douglas commented on MAPREDUCE-987: - This seems appropriate for the test jar. Small notes: * This picks up \-D params like the generic parser; would it make sense to also accept \-conf? The other params make less sense in this context, though it may be worth considering Tool/ToolRunner * It'd be better if sleepForever monitored the Mini\*Cluster, rather than waking up every minute for no reason. Not sure if it makes sense to include a poison pill (Path?) + configurable polling interval that might signal an orderly shutdown. * If this is intended for tests, should {{start}} wait for the TT/DNs to come up before returning? > Exposing MiniDFS and MiniMR clusters as a single process command-line > - > > Key: MAPREDUCE-987 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-987 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: build, test >Reporter: Philip Zeyliger >Assignee: Philip Zeyliger >Priority: Minor > Attachments: HDFS-621-0.20-patch, HDFS-621.patch, MAPREDUCE-987.patch > > > It's hard to test non-Java programs that rely on significant mapreduce > functionality. The patch I'm proposing shortly will let you just type > "bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster" to start a > cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number > of daemons, etc. A test that checks how some external process interacts with > Hadoop might start minicluster as a subprocess, run through its thing, and > then simply kill the java subprocess. > I've been using just such a system for a couple of weeks, and I like it. > It's significantly easier than developing a lot of scripts to start a > pseudo-distributed cluster, and then clean up after it. I figure others > might find it useful as well. > I'm at a bit of a loss as to where to put it in 0.21. 
hdfs-with-mr tests > have all the required libraries, so I've put it there. I could conceivably > split this into "minimr" and "minihdfs", but it's specifically the fact that > they're configured to talk to each other that I like about having them > together. And one JVM is better than two for my test programs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
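The "monitor the Mini\*Cluster rather than waking up every minute" suggestion above could look roughly like this (a sketch; {{MiniClusterWait}} and its methods are hypothetical, not from the attached patch):

```java
// Hypothetical sketch of the review suggestion: instead of a sleep loop that
// wakes periodically for no reason, block on a latch that a shutdown hook or
// poison-pill watcher counts down.
import java.util.concurrent.CountDownLatch;

public class MiniClusterWait {
    private final CountDownLatch shutdown = new CountDownLatch(1);

    // Blocks with no periodic wakeups until shutdown is requested.
    public void sleepForever() throws InterruptedException {
        shutdown.await();
    }

    // Called from a JVM shutdown hook, a signal handler, or a thread that
    // notices a poison-pill Path appearing on the MiniDFS cluster.
    public void requestShutdown() {
        shutdown.countDown();
    }
}
```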
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767129#action_12767129 ] Aaron Kimball commented on MAPREDUCE-972: - As discussed earlier, the FileSystem API does not provide a means for operations such as rename() to get access to a Progressable. I do not see a straightforward way to improve the S3FS / S3N implementations without extending the FileSystem API to add operations such as {{rename(src, dst, progress)}}. Are you +1 on doing that? Either way, I agree with your criticisms of the progress thread implementation. I have the following plan for improving this: * Make the progress thread's lifetime equal to that of the mapper. The first rename() operation starts it, and the join() moves to close() * Progress thread is only active when a rename() operation is underway. Use a volatile boolean to track this state. Otherwise it just sleeps. * Use {{Thread.interrupt()}} / {{isInterrupted()}} to interrupt the sleep in the main loop, so that we don't have to wait the full three seconds before the thread exits. * Add {{distcp.rename.timeout}} as a parameter which sets a max lifetime for the inner loop of the progress thread. Default value will be 10 seconds, but if it detects that the destination filesystem is s3n:// or s3fs://, ups this to fifteen minutes. - Aaron > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1077) When rumen reads a truncated job tracker log, it produces a job whose outcome is SUCCESS. Should be null.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767126#action_12767126 ] Hudson commented on MAPREDUCE-1077: --- Integrated in Hadoop-Mapreduce-trunk-Commit #82 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/82/]) . Fix Rumen so that truncated tasks do not mark the job as successful. Contributed by Dick King > When rumen reads a truncated job tracker log, it produces a job whose outcome > is SUCCESS. Should be null. > -- > > Key: MAPREDUCE-1077 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1077 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 >Reporter: Dick King >Assignee: Dick King > Fix For: 0.21.0 > > Attachments: mapreduce-1077--2009-10-14.patch, > mapreduce-1077--2009-10-16.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-931) rumen should use its own interpolation classes to create runtimes for simulated tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767121#action_12767121 ] Hudson commented on MAPREDUCE-931: -- Integrated in Hadoop-Mapreduce-trunk-Commit #81 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/81/]) . Use built-in interpolation classes for making up task runtimes in Rumen. Contributed by Dick King > rumen should use its own interpolation classes to create runtimes for > simulated tasks > - > > Key: MAPREDUCE-931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-931 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Dick King >Assignee: Dick King >Priority: Minor > Fix For: 0.21.0 > > Attachments: MAPREDUCE-931--2009-09-16--1633.patch, patch-931-b.patch > > > Currently, when a simulator or benchmark is running and simulating hadoop > jobs using rumen data, and rumen's runtime system is used to get execution > times for the tasks in the simulated jobs, rumen would use some ad hoc code, > despite the fact that rumen has a perfectly good interpolation framework to > generate random variables that fit discrete CDFs. > We should use the interpolation framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1077) When rumen reads a truncated job tracker log, it produces a job whose outcome is SUCCESS. Should be null.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1077: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Dick! > When rumen reads a truncated job tracker log, it produces a job whose outcome > is SUCCESS. Should be null. > -- > > Key: MAPREDUCE-1077 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1077 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 >Reporter: Dick King >Assignee: Dick King > Fix For: 0.21.0 > > Attachments: mapreduce-1077--2009-10-14.patch, > mapreduce-1077--2009-10-16.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-931) rumen should use its own interpolation classes to create runtimes for simulated tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-931: Resolution: Fixed Fix Version/s: 0.21.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Dick! > rumen should use its own interpolation classes to create runtimes for > simulated tasks > - > > Key: MAPREDUCE-931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-931 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Dick King >Assignee: Dick King >Priority: Minor > Fix For: 0.21.0 > > Attachments: MAPREDUCE-931--2009-09-16--1633.patch, patch-931-b.patch > > > Currently, when a simulator or benchmark is running and simulating hadoop > jobs using rumen data, and rumen's runtime system is used to get execution > times for the tasks in the simulated jobs, rumen would use some ad hoc code, > despite the fact that rumen has a perfectly good interpolation framework to > generate random variables that fit discrete CDFs. > We should use the interpolation framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-990) Making distributed cache getters in JobContext never return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-990: Status: Open (was: Patch Available) The patch is stale > Making distributed cache getters in JobContext never return null > > > Key: MAPREDUCE-990 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-990 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Philip Zeyliger >Assignee: Philip Zeyliger >Priority: Minor > Attachments: MAPREDUCE-990.patch, MAPREDUCE-990.patch.txt > > > MAPREDUCE-898 moved distributed cache setters and getters into Job and > JobContext. Since the API is new, I'd like to propose that those getters > never return null, but instead always return an array, even if it's empty. > If people don't like this change, I can instead merely update the javadoc to > reflect the fact that null may be returned. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767079#action_12767079 ] Hudson commented on MAPREDUCE-906: -- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Update Sqoop documentation. Contributed by Aaron Kimball > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767080#action_12767080 ] Hudson commented on MAPREDUCE-1104: --- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Initialize RecoveryManager in JobTracker cstr called by Mumak. Contributed by Hong Tang > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767081#action_12767081 ] Hudson commented on MAPREDUCE-1061: --- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Add unit test validating byte specifications for gridmix jobs. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767044#action_12767044 ] Hudson commented on MAPREDUCE-1061: --- Integrated in Hadoop-Mapreduce-trunk-Commit #80 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/80/]) . Add unit test validating byte specifications for gridmix jobs. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767040#action_12767040 ] Hudson commented on MAPREDUCE-1104: --- Integrated in Hadoop-Mapreduce-trunk-Commit #79 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/79/]) . Initialize RecoveryManager in JobTracker cstr called by Mumak. Contributed by Hong Tang > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1061: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767035#action_12767035 ] Hudson commented on MAPREDUCE-906: -- Integrated in Hadoop-Mapreduce-trunk-Commit #78 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/78/]) . Update Sqoop documentation. Contributed by Aaron Kimball > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1104: - Resolution: Fixed Fix Version/s: (was: 0.22.0) Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1 I committed this. Thanks, Hong! > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-906: Resolution: Fixed Fix Version/s: 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1 I committed this. Thanks, Aaron! > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-972: Status: Open (was: Patch Available) Really sorry to find this issue so late, but a progress thread that forbids tasks from timing out is not a good solution, particularly for distcp, where task timeouts are both legal and useful. If s3 requires a more elaborate rename mechanism, is there a way to push this into its implementation? While distcp may be a heavier user than most user jobs, the latter would also appreciate a more robust solution. Starting and joining a thread for every rename is also not an ideal design; the current implementation polls {{isComplete}} only every three seconds, slowing every rename. > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767022#action_12767022 ] Chris Douglas commented on MAPREDUCE-64: bq. one simple way might be to simply add TRACE level log messages at every collect() call with the current values of every index plus the spill number [...] That could be an interesting visualization. I'd already made up the diagrams, but anything that helps the analysis and validation would be welcome. I'd rather not add a trace to the committed code, but data from it sounds great. bq. I ran a simple test where I was running a sort of 10 byte records, and it turned out that the "optimal" io.sort.record.percent caused my job to be significantly slower. It was the case then that a small number of large spills actually ran slower than a large number of small spills. Did we ever determine what that issue was? I think we should try to understand why the theory isn't agreeing with observations here. IIRC those tests used a non-RawComparator, right? Runping reported similar results, where hits to concurrent collection were more expensive than small spills. The current theory is that keeping the map thread unblocked is usually better for performance. Based on this observation, I'm hoping that the spill.percent can also be eliminated at some point in the future, though the performance we're leaving on the table there is probably not as severe and is more difficult to generalize. Microbenchmarks may also not capture the expense of merging many small spills in a busy, shared cluster, where HDFS and other tasks are competing for disk bandwidth. I'll be very interested in metrics from MAPREDUCE-1115, as they would help to flesh out this hypothesis. The documentation (such as it is) in HADOOP-2919 describes the existing code.
The metadata are tracked using a set of indices marking the start and end of a spill ({{kvstart}}, {{kvend}}) and the current position ({{kvindex}}), while the serialization data are described by similar markers ({{bufstart}}, {{bufend}}, {{bufindex}}). There are two other indices carried over from the existing design. {{bufmark}} is the position in the serialized record data of the end of the last fully serialized record. {{bufvoid}} is necessary for the RawComparator interface, which requires contiguous ranges for key compares; if a serialized key crosses the end of the buffer, it must be copied to the front to satisfy the aforementioned API spec. All of these are retained; the role of each is largely unchanged. The proposed design adds another parameter, the {{equator}} (while {{kvstart}} and {{bufstart}} could be replaced with a single variable similar to {{equator}}, the effort seemed misspent). The record serialization moves "forward" in the buffer, while the metadata are allocated in 16-byte blocks in the opposite direction. This is illustrated in the following diagram: !M64-0i.png|thumbnail! The role played by kvoffsets and kvindices is preserved; logically, particularly in the spill, each is interpreted in roughly the same way. In the new code, the allocation is not static, but will instead expand with the serialized records. This avoids degenerate cases for combiners and multilevel merges (though not necessarily optimal performance). Spills are triggered under two conditions: either the soft limit is reached (collection proceeds concurrently with the spill) or a record is large enough to require a spill before it can be written to the buffer (collection is blocked). The former case is illustrated here: !M64-1i.png|thumbnail! 
The {{equator}} is moved to an offset proportional to the average record size (caveats [above|https://issues.apache.org/jira/browse/MAPREDUCE-64?focusedCommentId=12765984&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12765984]), {{kvindex}} is moved off the equator, aligned with the end of the array (int alignment, also so no metadata block will span the end of the array). Collection proceeds again from the equator, growing toward the ends of the spill. Should either run out of space, collection will block until the spill completes. Note that there is no partially written data when the soft limit is reached; it can only be triggered in collect, not in the blocking buffer. The other case to consider is when record data are partially written into the collection buffer, but the available space is exhausted: !M64-2i.png|thumbnail! Here, the equator is moved to the beginning of the partial record and collection blocks. When the spill completes, the metadata are written off the equator and serialization of the record can continue. During collection, indices are adjusted only when holding a lock. As in the current code, the lock is only obtained in collect when one of the possible conditions for spilling is met.
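A toy model of the equator-based index arithmetic described above (the 16-byte metadata block size and index names follow the description, but the class itself is illustrative, not the actual MapTask code):

```java
// Toy model only: serialized record bytes advance "forward" from the equator
// while 16-byte metadata blocks are allocated "backward", both modulo the
// buffer length. Names mirror the description; the class is hypothetical.
public class EquatorSketch {
    static final int METASIZE = 16;   // one metadata block per record

    final int bufLen;
    int equator;    // boundary between record data and metadata regions
    int bufindex;   // next write position for serialized record bytes
    int kvindex;    // next metadata block, moving opposite to bufindex

    EquatorSketch(int bufLen, int equator) {
        this.bufLen = bufLen;
        this.equator = equator;
        this.bufindex = equator;
        // first metadata block sits immediately "behind" the equator
        this.kvindex = ((equator - METASIZE) % bufLen + bufLen) % bufLen;
    }

    void writeRecord(int serializedBytes) {
        bufindex = (bufindex + serializedBytes) % bufLen;             // data forward
        kvindex = ((kvindex - METASIZE) % bufLen + bufLen) % bufLen;  // meta backward
    }

    public static void main(String[] args) {
        EquatorSketch b = new EquatorSketch(1 << 10, 512);
        b.writeRecord(100);
        // prints "612 480": data advanced to 612, metadata retreated to 480
        System.out.println(b.bufindex + " " + b.kvindex);
    }
}
```

The modular arithmetic is why neither region has a fixed size: the two fronts simply grow toward each other until the soft limit or an exhausted buffer triggers a spill.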
[jira] Updated: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-64: --- Attachment: M64-0i.png M64-1i.png M64-2i.png > Map-side sort is hampered by io.sort.record.percent > --- > > Key: MAPREDUCE-64 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-64 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Chris Douglas > Attachments: M64-0.patch, M64-0i.png, M64-1.patch, M64-1i.png, > M64-2.patch, M64-2i.png, M64-3.patch > > > Currently io.sort.record.percent is a fairly obscure, per-job configurable, > expert-level parameter which controls how much accounting space is available > for records in the map-side sort buffer (io.sort.mb). Typically values for > io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store > ~350,000 records in the buffer before necessitating a sort/combine/spill. > However for many applications which deal with small records e.g. the > world-famous wordcount and it's family this implies we can only use 5-10% of > io.sort.mb i.e. (5-10M) before we spill inspite of having _much_ more memory > available in the sort-buffer. The word-count for e.g. results in ~12 spills > (given hdfs block size of 64M). The presence of a combiner exacerbates the > problem by piling serialization/deserialization of records too... > Sure, jobs can configure io.sort.record.percent, but it's tedious and > obscure; we really can do better by getting the framework to automagically > pick it by using all available memory (upto io.sort.mb) for either the data > or accounting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
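The ~350,000-record figure in the description above can be checked directly; the 16-byte-per-record accounting size is taken from the existing map-side sort design discussed earlier in this thread, and the class name is illustrative:

```java
// Worked arithmetic for the io.sort.record.percent example in the description.
public class SortBufferMath {
    public static void main(String[] args) {
        long ioSortBytes = 100L << 20;            // io.sort.mb = 100
        double recordPercent = 0.05;              // io.sort.record.percent
        long accountingBytes = (long) (ioSortBytes * recordPercent);
        long perRecord = 16;                      // accounting bytes per record
        long maxRecords = accountingBytes / perRecord;
        // With 10-byte records, the data region only ever holds ~3.3 MB of
        // the ~95 MB available before accounting space forces a spill.
        long dataBytesUsed = maxRecords * 10;
        // prints "327680 3276800"
        System.out.println(maxRecords + " " + dataBytesUsed);
    }
}
```

That is, roughly 327,680 records ("~350,000" in the description) and about 3.3 MB of record data per spill, which is why small-record jobs like wordcount spill so often despite a mostly empty buffer.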