[jira] Updated: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-279:

Description:
Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component that manages the application execution.

(was:
We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. a 2k-node HDFS). More details about HoD are available here: HADOOP-1301.

h3. Motivation

The current deployment (Hadoop + HoD) has a couple of implications:

* _Non-optimal Cluster Utilization_
1. Job-private Map-Reduce clusters imply that the user's cluster could potentially be *idle* for at least a while before being detected and shut down.
2. Elastic Jobs: Map-Reduce jobs typically have lots of maps with a much smaller number of reduces, with maps being light and quick and reduces being I/O-heavy and longer-running. Users typically allocate clusters depending on the number of maps (i.e. input size), which leads to the scenario where all the maps are done (idle nodes in the cluster) while the few reduces are chugging along. Right now we do not have the ability to shrink HoD'ed Map-Reduce clusters, which would alleviate this issue.
* _Impact on data-locality_
With the current setup of a static, large HDFS and much smaller (5/10/20/50-node) clusters, there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but still). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here.

Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg to hang the majority of the blame on.

Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues.

h3. State of the Art

As it exists today, a large, static Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however, its scalability to tens of thousands of user jobs per week is in question. Let's review its current architecture and main components:

* JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single point of failure_ in the Map-Reduce framework, i.e. its failure implies that all the jobs in the system fail. This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks.
* TaskTracker: The slave in the system, which executes one task at a time under direction from the JobTracker.
* JobClient: The per-job client, which just submits the job and polls the JobTracker for status.

h3. Proposal - Map-Reduce 2.0

The primary idea is to move to task-level scheduling and static Map-Reduce clusters (so as to maintain the same storage-cluster and compute-cluster paradigm) as a way to directly tackle the two main issues illustrated above. Clearly, we will have to get around the existing problems, especially w.r.t. scalability and reliability.

The proposal is to re-work Hadoop Map-Reduce to make it suitable for a large, static cluster. Here is an overview of what its main components would look like:

* JobTracker: Turn the JobTracker into a pure, global task-scheduler. Let's call this the *JobScheduler* henceforth. Clearly, (data-locality-aware) Maui/Moab are candidates for being the scheduler, in which case the JobScheduler is just a thin wrapper around them.
* TaskTracker: These stay as before, with some minor changes as illustrated later in the piece.
* JobClient: Fatten up the JobClient by putting a lot more intelligence into it. Enhance it to talk to the JobTracker to ask for available TaskTrackers and then contact them to schedule and monitor the tasks. So we'll have lots of per-job clients talking to the JobScheduler and the relevant TaskTrackers for their respective jobs, a big change from today. Let's call this the *JobManager* henceforth.

A broad sketch of how things would work:

h4. Deployment

There is a single, static, large Map-Reduce cluster, and no per-job
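The JobScheduler/JobManager split described above can be sketched in a few lines. This is a hypothetical, in-memory illustration only, with invented names (JobScheduler, JobManager, requestTrackers, runTasks are not Hadoop APIs); it mirrors just the division of labor: a thin global scheduler hands out TaskTrackers, and the fattened per-job client places its own tasks on them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class Mr2Sketch {

    /** The thin, global task-scheduler (the re-purposed JobTracker). */
    static class JobScheduler {
        private final Queue<String> freeTrackers = new ArrayDeque<>();

        JobScheduler(List<String> trackers) {
            freeTrackers.addAll(trackers);
        }

        /** Grant up to n currently available TaskTrackers to a requesting JobManager. */
        synchronized List<String> requestTrackers(int n) {
            List<String> granted = new ArrayList<>();
            while (granted.size() < n && !freeTrackers.isEmpty()) {
                granted.add(freeTrackers.poll());
            }
            return granted; // may be fewer than requested
        }
    }

    /** The fattened per-job client: asks for trackers, then drives tasks itself. */
    static class JobManager {
        private final JobScheduler scheduler;

        JobManager(JobScheduler scheduler) {
            this.scheduler = scheduler;
        }

        /** Returns the number of tasks successfully placed on granted trackers. */
        int runTasks(int numTasks) {
            List<String> trackers = scheduler.requestTrackers(numTasks);
            // In the proposal the JobManager would now contact each TaskTracker
            // to schedule and monitor a task; here we only count the grants.
            return trackers.size();
        }
    }

    public static void main(String[] args) {
        JobScheduler scheduler = new JobScheduler(List.of("tt1", "tt2", "tt3"));
        JobManager jobA = new JobManager(scheduler);
        JobManager jobB = new JobManager(scheduler);
        System.out.println(jobA.runTasks(2)); // 2
        System.out.println(jobB.runTasks(2)); // 1, only one tracker left
    }
}
```

The point of the sketch is that per-job state lives entirely in the JobManager, leaving the global JobScheduler with nothing but the free-tracker pool.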
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994693#comment-12994693 ]

eric baldeschwieler commented on MAPREDUCE-279:
---

We're having a baby! Todd Papaioannou (p9u) is acting head of Hadoop. Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as appropriate. I'll be back on roughly March 9th. CUSoon, E14

Map-Reduce 2.0
--
Key: MAPREDUCE-279
URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobtracker, tasktracker
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Fix For: 0.23.0

Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component that manages the application execution.

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated MAPREDUCE-279:

Comment: was deleted (was: We're having a baby! Todd Papaioannou (p9u) is acting head of Hadoop. Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as appropriate. I'll be back on roughly March 9th. CUSoon, E14)
[jira] Created: (MAPREDUCE-2329) RAID BlockFixer should exclude temporary files
RAID BlockFixer should exclude temporary files
--
Key: MAPREDUCE-2329
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2329
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.2, 0.20.3
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
Priority: Minor

RAID BlockFixer should exclude files matching the pattern ^/tmp/.*
[jira] Updated: (MAPREDUCE-2329) RAID BlockFixer should exclude temporary files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-2329:
---
Component/s: contrib/raid

RAID BlockFixer should exclude temporary files
--
Key: MAPREDUCE-2329
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2329
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/raid
Affects Versions: 0.20.2, 0.20.3
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
Priority: Minor

RAID BlockFixer should exclude files matching the pattern ^/tmp/.*
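The effect of the requested filter, reduced to plain path matching: drop anything matching ^/tmp/.* before the fixer considers it. The class and method names below are ours for illustration, not BlockFixer's actual API.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TmpFilter {
    // The exact pattern named in the ticket: anything under /tmp.
    private static final Pattern TMP = Pattern.compile("^/tmp/.*");

    /** Keep only the paths the fixer should consider (non-temporary files). */
    public static List<String> excludeTemporary(List<String> paths) {
        return paths.stream()
                .filter(p -> !TMP.matcher(p).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> in = List.of("/tmp/job_1/part-0", "/user/a/data", "/tmp/x");
        System.out.println(excludeTemporary(in)); // [/user/a/data]
    }
}
```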
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994928#comment-12994928 ]

Arun C Murthy commented on MAPREDUCE-279:
-

h5. Proposal

The fundamental idea of the re-factor is to divide the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate components: a generic resource scheduler and a per-job, user-defined component that manages the application execution.

The new ResourceManager manages the global assignment of compute resources to applications, and the per-application ApplicationMaster manages the application's scheduling and coordination. An application is either a single job in the classic MapReduce sense or a DAG of such jobs. The ResourceManager and the per-machine NodeManager server, which manages the user processes on that machine, form the computation fabric. The per-application ApplicationMaster is, in effect, a framework-specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.

The ResourceManager is a pure scheduler in the sense that it performs no monitoring or tracking of status for the application. Also, it offers no guarantees on restarting failed tasks, whether due to application failure or hardware failures. The ResourceManager performs its scheduling function based on the resource requirements of the applications; each application has multiple resource-request types that represent the resources required for containers. The resource requests include memory, CPU, disk, network etc. Note that this is a significant change from the current model of fixed-type slots in Hadoop MapReduce, which has a significant negative impact on cluster utilization. The ResourceManager has a scheduler policy plug-in, which is responsible for partitioning the cluster resources among various queues, applications etc. Scheduler plug-ins can be based, for example, on the current CapacityScheduler and FairScheduler.

The NodeManager is the per-machine framework agent that is responsible for launching the applications' containers, monitoring their resource usage (CPU, memory, disk, network) and reporting the same to the Scheduler.

The per-application ApplicationMaster has the responsibility of negotiating appropriate resource containers from the Scheduler, launching tasks, tracking their status, monitoring for progress, handling task failures and recovering from saved state on a ResourceManager fail-over.

Since downtime is more expensive at scale, high availability is built in from the beginning, via Apache ZooKeeper for the ResourceManager and an HDFS checkpoint for the MapReduce ApplicationMaster. Security and multi-tenancy support is critical to support many users on the larger clusters. The new architecture will also increase innovation and agility by allowing for user-defined versions of the MapReduce runtime. Support for generic resource requests will increase cluster utilization by removing artificial bottlenecks such as the hard partitioning of resources into map and reduce slots.

We have a *prototype* we'd like to commit to a branch soon, where we look forward to feedback. From there on, we would love to collaborate to get it committed to trunk.
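The ResourceManager/ApplicationMaster division of labor described in this proposal can be sketched as follows. This is a minimal, invented illustration (none of these classes or methods are the real API): the ResourceManager only matches generic resource requests against remaining capacity and keeps no per-application state, while the ApplicationMaster negotiates containers and tracks its own tasks.

```java
import java.util.ArrayList;
import java.util.List;

public class YarnSketch {
    /** A granted resource container (memory only, for brevity). */
    static class Container {
        final int memMb;
        Container(int memMb) { this.memMb = memMb; }
    }

    /** Pure scheduler: matches requests against capacity, tracks no app status. */
    static class ResourceManager {
        private int freeMemMb;
        ResourceManager(int totalMemMb) { this.freeMemMb = totalMemMb; }

        synchronized List<Container> allocate(int memMbEach, int count) {
            List<Container> granted = new ArrayList<>();
            while (granted.size() < count && freeMemMb >= memMbEach) {
                freeMemMb -= memMbEach;
                granted.add(new Container(memMbEach));
            }
            return granted; // may be fewer than asked; no restart guarantees
        }
    }

    /** Per-application master: negotiates containers, monitors its own tasks. */
    static class ApplicationMaster {
        private final ResourceManager rm;
        ApplicationMaster(ResourceManager rm) { this.rm = rm; }

        int launchTasks(int tasks, int memMbPerTask) {
            List<Container> granted = rm.allocate(memMbPerTask, tasks);
            // The real AM would now ask each NodeManager to launch a process
            // in its container and track progress; we only count the grants.
            return granted.size();
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager(4096);
        ApplicationMaster am = new ApplicationMaster(rm);
        System.out.println(am.launchTasks(3, 1024)); // 3
        System.out.println(am.launchTasks(3, 1024)); // 1, only 1024 MB left
    }
}
```

Note how the generic memory request replaces fixed map/reduce slots: any application competes for the same pool, which is exactly the utilization argument made above.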
[jira] Updated: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated MAPREDUCE-279:

Comment: was deleted (was: the full "h5. Proposal" comment reproduced above)
[jira] Commented: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994991#comment-12994991 ]

Roman Shaposhnik commented on MAPREDUCE-2314:
-

$ ant veryclean test-patch -Dpatch.file=/home/rvs/MAPREDUCE-2314.patch -Dscratch.dir=/tmp/mapred/scratch -Dfindbugs.home=/home/rvs/findbugs-1.3.9 -Dforrest.home=/home/rvs/src/apache-forrest-0.8

[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no new tests are needed for this patch.
[exec] Also please list what manual steps were performed to verify this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec] +1 system test framework. The patch passed system test framework compile.

And given that it is a build change, the "-1 tests included" can be ignored.

configure files that are generated as part of the released tarball need to have executable bit set
--
Key: MAPREDUCE-2314
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2314
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Attachments: MAPREDUCE-2314.patch

Currently the configure files that are packaged in a tarball are -rw-rw-r--
[jira] Commented: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994993#comment-12994993 ]

Roman Shaposhnik commented on MAPREDUCE-2314:
-

To answer the chmod question -- I followed the already-established pattern of not doing chmods in parallel.
[jira] Updated: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated MAPREDUCE-2314:
Component/s: build
Affects Version/s: 0.22.0

configure files that are generated as part of the released tarball need to have executable bit set
--
Key: MAPREDUCE-2314
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2314
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: build
Affects Versions: 0.22.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Attachments: MAPREDUCE-2314.patch

Currently the configure files that are packaged in a tarball are -rw-rw-r--
[jira] Updated: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated MAPREDUCE-2314:
--
Hadoop Flags: [Reviewed]

+1 patch looks good.
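The actual fix in this ticket is an Ant build change, but the effect it produces is just this: flipping the execute bit on each packaged configure script so it is no longer -rw-rw-r--. A hypothetical stand-alone equivalent (the class and method names are ours, not part of the patch):

```java
import java.io.File;
import java.io.IOException;

public class MakeExecutable {
    /** Set the execute bit for all users; returns true if the file ends up executable. */
    public static boolean markExecutable(File f) {
        // setExecutable(true, false): second argument false means the bit is
        // set for everyone, not just the owner, which is what a shipped
        // configure script needs.
        return f.setExecutable(true, false) && f.canExecute();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("configure", null);
        f.deleteOnExit();
        System.out.println(markExecutable(f));
    }
}
```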
[jira] Commented: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995003#comment-12995003 ]

Konstantin Boudnik commented on MAPREDUCE-1783:
---

Looks like when the change was committed to 0.22 the CHANGES.txt file was updated improperly (added to the IMPROVEMENTS section instead of BUG FIXES), which causes problems now for downstream merges. Also, the description of the JIRA has been written to CHANGES.txt differently from what it says on the ticket ;( Please fix.

Task Initialization should be delayed till when a job can be run
--
Key: MAPREDUCE-1783
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/fair-share
Affects Versions: 0.20.1
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
Fix For: 0.22.0, 0.23.0
Attachments: 0001-Pool-aware-job-initialization.patch, 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, submit-mapreduce-1783.patch

The FairScheduler task scheduler uses PoolManager to impose limits on the number of jobs that can be running at a given time. However, jobs that are submitted are initialized immediately by EagerTaskInitializationListener by calling JobInProgress.initTasks. This causes the job split file to be read into memory. The split information is not needed until the number of running jobs is less than the maximum specified. If the amount of split information is large, this leads to unnecessary memory pressure on the JobTracker. To ease memory pressure, FairScheduler can use another implementation of JobInProgressListener that is aware of PoolManager limits and can delay task initialization until the number of running jobs is below the maximum.
[jira] Commented: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995004#comment-12995004 ]

Konstantin Boudnik commented on MAPREDUCE-1783:
---

I have fixed it.
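The pool-aware idea from the MAPREDUCE-1783 description can be sketched as follows: queue submitted jobs and only "initialize" them (i.e. the moment the real code would read the split file) while the running count is under the limit. All names here are illustrative; the actual patch implements a JobInProgressListener against FairScheduler's PoolManager.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class PoolAwareInit {
    private final int maxRunning;
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> initialized = new ArrayList<>();

    public PoolAwareInit(int maxRunning) { this.maxRunning = maxRunning; }

    /** Called on job submission: defer instead of eagerly initializing. */
    public void jobAdded(String jobId) {
        pending.add(jobId);
        drain();
    }

    /** Called when a running job finishes, freeing a slot. */
    public void jobCompleted(String jobId) {
        initialized.remove(jobId);
        drain();
    }

    /** Initialize queued jobs only while under the pool limit. */
    private void drain() {
        while (initialized.size() < maxRunning && !pending.isEmpty()) {
            // The real listener would call JobInProgress.initTasks() here,
            // which is when the split file is finally read into memory.
            initialized.add(pending.poll());
        }
    }

    public List<String> running() { return List.copyOf(initialized); }

    public static void main(String[] args) {
        PoolAwareInit pool = new PoolAwareInit(2);
        pool.jobAdded("j1"); pool.jobAdded("j2"); pool.jobAdded("j3");
        System.out.println(pool.running()); // [j1, j2], j3 deferred
        pool.jobCompleted("j1");
        System.out.println(pool.running()); // [j2, j3]
    }
}
```

The memory win is that j3's split data is simply not loaded until a slot opens.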
[jira] Updated: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated MAPREDUCE-2314:
--
Resolution: Fixed
Status: Resolved (was: Patch Available)

I have committed this to trunk and branch-0.22. Thanks Roman!
[jira] Updated: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated MAPREDUCE-2314:
--
Fix Version/s: 0.22.0

configure files that are generated as part of the released tarball need to have executable bit set
--
Key: MAPREDUCE-2314
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2314
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: build
Affects Versions: 0.22.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Fix For: 0.22.0
Attachments: MAPREDUCE-2314.patch

Currently the configure files that are packaged in a tarball are -rw-rw-r--
[jira] Commented: (MAPREDUCE-2314) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995010#comment-12995010 ]

Hudson commented on MAPREDUCE-2314:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #611 (See [https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/611/])
MAPREDUCE-2314. configure files that are generated as part of the released tarball need to have executable bit set. Contributed by Roman Shaposhnik.
[jira] Created: (MAPREDUCE-2330) Forward port MapReduce server MXBeans
Forward port MapReduce server MXBeans
-
Key: MAPREDUCE-2330
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2330
Project: Hadoop Map/Reduce
Issue Type: New Feature
Reporter: Luke Lu
Fix For: 0.23.0

Some JMX classes, e.g. JobTrackerMXBean and TaskTrackerMXBean in 0.20.100~, need to be forward-ported to 0.23 in some fashion, depending on how MapReduce 2.0 emerges. Note that a similar item for HDFS, HDFS-1318, is already in 0.22.
[jira] Commented: (MAPREDUCE-2254) Allow setting of end-of-record delimiter for TextInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995017#comment-12995017 ]

Todd Lipcon commented on MAPREDUCE-2254:

Looks good except one minor nit - can you add the Apache license to the new test file?

Allow setting of end-of-record delimiter for TextInputFormat
--
Key: MAPREDUCE-2254
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2254
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Ahmed Radwan
Attachments: MAPREDUCE-2245.patch, MAPREDUCE-2254_r2.patch

It would be useful to allow setting the end-of-record delimiter for TextInputFormat. The current implementation hardcodes '\n', '\r' or '\r\n' as the only possible record delimiters. This is a problem if users have embedded newlines in their data fields (which is pretty common). This is also a problem for other tools using this TextInputFormat (see for example: https://issues.apache.org/jira/browse/PIG-836 and https://issues.cloudera.org/browse/SQOOP-136). I have written a patch to address this issue. This patch allows users to specify any custom end-of-record delimiter using a newly added configuration property. For backward compatibility, if this new configuration property is absent, the same previous delimiters are used (i.e. '\n', '\r' or '\r\n').
[jira] Updated: (MAPREDUCE-2254) Allow setting of end-of-record delimiter for TextInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Radwan updated MAPREDUCE-2254:
Attachment: MAPREDUCE-2254_r3.patch

Adding the Apache license header to the test file.

Allow setting of end-of-record delimiter for TextInputFormat
--
Key: MAPREDUCE-2254
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2254
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Ahmed Radwan
Attachments: MAPREDUCE-2245.patch, MAPREDUCE-2254_r2.patch, MAPREDUCE-2254_r3.patch
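What the configurable delimiter in MAPREDUCE-2254 buys you, reduced to plain string handling: records are split on a user-chosen byte sequence instead of the hardcoded newline variants, so embedded newlines inside a record are preserved. The helper below is our illustration, not the patch's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CustomDelimiter {
    /** Split data into records on the given delimiter, treated literally (not as a regex). */
    public static List<String> records(String data, String delimiter) {
        return Arrays.asList(data.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        // Each record contains an embedded newline, so '\n' can't be the
        // record delimiter; '|' is used instead.
        String data = "name: a\naddr: x|name: b\naddr: y";
        System.out.println(records(data, "|").size()); // 2
    }
}
```

With the hardcoded delimiters this data would be misread as four records; the custom delimiter keeps the two logical records intact.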
[jira] Commented: (MAPREDUCE-1921) IOExceptions should contain the filename of the broken input files
[ https://issues.apache.org/jira/browse/MAPREDUCE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995126#comment-12995126 ]

Krishna Ramachandran commented on MAPREDUCE-1921:
-

Tom, you are right. I will update the patch to address the lapse. It has been a long time since I looked at this; I am glad to fix and revive it. Regards, Krishna

IOExceptions should contain the filename of the broken input files
--
Key: MAPREDUCE-1921
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1921
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Krishna Ramachandran
Assignee: Krishna Ramachandran
Attachments: mapred-1921-1.patch, mapred-1921-3.patch, mapreduce-1921.patch

If bzip or other decompression fails, the IOException does not contain the name of the broken file that caused the exception. It would be nice if such situations could be diagnosed quickly in the future by having the names of the broken files spelled out in the exception.
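The shape of the fix discussed in MAPREDUCE-1921: when a read or decompression fails, rethrow with the file name attached instead of propagating the bare exception. The wrapper below is a hypothetical illustration, not the patch itself.

```java
import java.io.IOException;

public class NamedIOException {
    /** Wrap a low-level IOException so the broken input file is named. */
    public static IOException withFile(String path, IOException cause) {
        // Chaining keeps the original decompression error available via getCause().
        return new IOException("error reading input file " + path, cause);
    }

    public static void main(String[] args) {
        IOException raw = new IOException("unexpected end of bzip2 stream");
        IOException wrapped = withFile("/user/a/part-00042.bz2", raw);
        System.out.println(wrapped.getMessage());
        System.out.println(wrapped.getCause() == raw); // true
    }
}
```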
[jira] Created: (MAPREDUCE-2331) Add coverage of task graph servlet to fair scheduler system test
Add coverage of task graph servlet to fair scheduler system test Key: MAPREDUCE-2331 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2331 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0 It would be useful to hit the TaskGraph servlet in the fair scheduler system test. That way, when the test is run under JCarder, it will check for any lock inversions in this code.
[jira] Updated: (MAPREDUCE-2116) optimize getTasksToKill to reduce JobTracker contention
[ https://issues.apache.org/jira/browse/MAPREDUCE-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kang Xiao updated MAPREDUCE-2116: - Attachment: 2116.4.patch optimize getTasksToKill to reduce JobTracker contention --- Key: MAPREDUCE-2116 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2116 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Joydeep Sen Sarma Assignee: Joydeep Sen Sarma Attachments: 2116.1.patch, 2116.2.patch, 2116.3.patch, 2116.4.patch, getTaskToKill.JPG getTasksToKill shows up as one of the top routines holding the JT lock. Specifically, the translation from attemptid to tip is very expensive: at java.util.TreeMap.getEntry(TreeMap.java:328) at java.util.TreeMap.get(TreeMap.java:255) at org.apache.hadoop.mapred.TaskInProgress.shouldClose(TaskInProgress.java:500) at org.apache.hadoop.mapred.JobTracker.getTasksToKill(JobTracker.java:3464) locked 0x2aab6ebb6640 (a org.apache.hadoop.mapred.JobTracker) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3181) This seems like an avoidable expense, since the tip for a given attempt is fixed (one should not need a map lookup to find the association). On a different note, it is not clear to me why TreeMaps are in use here (I did not find any iteration over these maps). Any background info on why things are arranged the way they are would be useful.
[jira] Commented: (MAPREDUCE-2116) optimize getTasksToKill to reduce JobTracker contention
[ https://issues.apache.org/jira/browse/MAPREDUCE-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995176#comment-12995176 ] Kang Xiao commented on MAPREDUCE-2116: -- I am sorry for the late response. A patch is attached for the suggestion. It uses the same approach as getTasksToSave(), iterating only over the tasks currently running on the tasktracker. optimize getTasksToKill to reduce JobTracker contention --- Key: MAPREDUCE-2116 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2116 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Joydeep Sen Sarma Assignee: Joydeep Sen Sarma Attachments: 2116.1.patch, 2116.2.patch, 2116.3.patch, 2116.4.patch, getTaskToKill.JPG
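The shape of the fix described in the comment can be sketched as follows. This is illustrative only (the class and field names are invented, not the real JobTracker internals): rather than doing a TreeMap lookup per attempt id while holding the global lock, iterate only over the tasks currently running on the heartbeating tracker and collect the ones whose job should be killed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Sketch of the contention fix: a single linear scan over the tracker's
 *  running tasks, with no per-attempt map lookup under the global lock. */
public class KillScan {
    /** Stand-in for a running task attempt and its job's state. */
    public static class Task {
        final String attemptId;
        final boolean jobFailed;
        public Task(String attemptId, boolean jobFailed) {
            this.attemptId = attemptId;
            this.jobFailed = jobFailed;
        }
    }

    /** O(tasks on this tracker) instead of a TreeMap lookup per attempt. */
    public static List<String> getTasksToKill(Collection<Task> runningOnTracker) {
        List<String> toKill = new ArrayList<>();
        for (Task t : runningOnTracker) {
            if (t.jobFailed) {
                toKill.add(t.attemptId);
            }
        }
        return toKill;
    }
}
```

The point of the design is that the set of tasks running on one tracker is small, so a linear scan over it is cheap, whereas a map lookup per attempt in the global table is the expense the stack trace above attributes to getTasksToKill.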
[jira] Created: (MAPREDUCE-2332) Improve error messages when MR dirs on local FS have bad ownership
Improve error messages when MR dirs on local FS have bad ownership -- Key: MAPREDUCE-2332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2332 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: mr-2332.txt A common source of user difficulty on a secure cluster is understanding which paths should be owned by which users. The task log directory in particular is often missed, since it has to be owned by mapred but may be inside a logs dir which has different ownership. Right now the user has to spelunk in the code to understand the exception they get if this dir has bad ownership.
[jira] Updated: (MAPREDUCE-2332) Improve error messages when MR dirs on local FS have bad ownership
[ https://issues.apache.org/jira/browse/MAPREDUCE-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-2332: --- Attachment: mr-2332.txt Something like this? Improve error messages when MR dirs on local FS have bad ownership -- Key: MAPREDUCE-2332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2332 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: mr-2332.txt
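The kind of check this issue asks for can be sketched as follows. This is not the attached mr-2332.txt patch; it is a minimal illustration (the class and method names are invented) of verifying a local directory's owner and failing with a message that names both the path and the expected owner, so the user does not have to spelunk in the code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a descriptive ownership check for local MR directories. */
public class OwnershipCheck {
    public static void checkOwner(Path dir, String expectedOwner) throws IOException {
        String actual = Files.getOwner(dir).getName();
        if (!actual.equals(expectedOwner)) {
            // Name the path, the required owner, and the actual owner so the
            // fix (chown) is obvious from the message alone.
            throw new IOException("Path " + dir + " must be owned by user '"
                + expectedOwner + "' but is owned by '" + actual + "'");
        }
    }
}
```

For example, a task log directory owned by root instead of mapred would fail with a message naming the directory and both users, rather than an opaque exception from deeper in the framework.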