[jira] [Resolved] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-3868. --- Resolution: Fixed Hadoop Flags: Reviewed I just committed this. Thanks, Weiyan

[jira] [Created] (MAPREDUCE-3868) Reenable Raid

2012-02-15 Thread Scott Chen (Created) (JIRA)
Reenable Raid - Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/raid Reporter: Scott Chen

[jira] [Resolved] (MAPREDUCE-2198) Allow FairScheduler to control the number of slots on each TaskTracker

2011-09-14 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2198. --- Resolution: Won't Fix > Allow FairScheduler to control the number of slots

[jira] [Resolved] (MAPREDUCE-2108) Allow TaskScheduler manage number slots on TaskTrackers

2011-09-14 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2108. --- Resolution: Won't Fix > Allow TaskScheduler manage number slots on Task

[jira] [Created] (MAPREDUCE-2601) Add a filter text box to FairSchedulerServlet page

2011-06-16 Thread Scott Chen (JIRA)
Components: contrib/fair-share Reporter: Scott Chen Priority: Minor It will be useful if we can filter pool in the fairscheduler UI page. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (MAPREDUCE-2579) The parity path is not initial correctly in BlockPlacementPolicyRaid

2011-06-08 Thread Scott Chen (JIRA)
Issue Type: Bug Components: contrib/raid Reporter: Scott Chen Assignee: Scott Chen BlockPlacementPolicyRaid.initialize() initializes the parity paths. It uses Path.makeQualified(), which needs to contact the namenode, but the namenode is not up yet.

[jira] [Created] (MAPREDUCE-2442) Add a JSP page for RaidNode

2011-04-17 Thread Scott Chen (JIRA)
: Scott Chen Assignee: Scott Chen

[jira] [Resolved] (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2011-03-31 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1861. --- Resolution: Won't Fix We found that this approach puts some load on namenode. We wi

[jira] Created: (MAPREDUCE-2348) TestSimulator* failed on trunk

2011-02-28 Thread Scott Chen (JIRA)
: Scott Chen Priority: Blocker All Failed Tests {code} org.apache.hadoop.mapred.TestSimulatorJobTracker.testTrackerInteraction org.apache.hadoop.mapred.TestSimulatorDeterministicReplay.testMain org.apache.hadoop.mapred.TestSimulatorEndToEnd.testMain

[jira] Created: (MAPREDUCE-2302) Add static factory methods in GaloisField

2011-02-03 Thread Scott Chen (JIRA)
: contrib/raid Reporter: Scott Chen Assignee: Scott Chen GaloisField is immutable and should be reused after creation to avoid redundant calculation of the multiplication and division tables.
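A minimal sketch of the static-factory idea, assuming a hypothetical class `CachedGaloisField` (not the contrib/raid `GaloisField` source) and the common GF(2^8) primitive polynomial 0x11D (285). The lookup tables are built once per (field size, polynomial) pair, and the same immutable instance is returned on every later request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a GF(2^m) class; names and layout are assumptions.
final class CachedGaloisField {
    private static final Map<Long, CachedGaloisField> INSTANCES = new HashMap<>();

    private final int fieldSize;
    private final int[] logTable;
    private final int[] powTable;

    // Private constructor: the tables are computed exactly once per instance.
    private CachedGaloisField(int fieldSize, int primitivePolynomial) {
        this.fieldSize = fieldSize;
        this.logTable = new int[fieldSize];
        this.powTable = new int[fieldSize];
        int value = 1;
        for (int pow = 0; pow < fieldSize - 1; pow++) {
            powTable[pow] = value;
            logTable[value] = pow;
            value <<= 1;
            if (value >= fieldSize) {
                value ^= primitivePolynomial; // reduce modulo the primitive polynomial
            }
        }
    }

    // Static factory: returns the cached instance for this parameter pair.
    static synchronized CachedGaloisField getInstance(int fieldSize, int primitivePolynomial) {
        long key = ((long) fieldSize << 32) | (primitivePolynomial & 0xFFFFFFFFL);
        return INSTANCES.computeIfAbsent(
                key, k -> new CachedGaloisField(fieldSize, primitivePolynomial));
    }

    // Multiplication through the precomputed log and power tables.
    int multiply(int x, int y) {
        if (x == 0 || y == 0) {
            return 0;
        }
        return powTable[(logTable[x] + logTable[y]) % (fieldSize - 1)];
    }

    public static void main(String[] args) {
        CachedGaloisField a = getInstance(256, 285);
        CachedGaloisField b = getInstance(256, 285);
        System.out.println("same instance: " + (a == b));
        System.out.println("3 * 7 in GF(256): " + a.multiply(3, 7));
    }
}
```

Because the class is immutable, sharing one instance between callers is safe; the synchronized factory only guards the cache itself.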

[jira] Created: (MAPREDUCE-2292) Provide a shell interface for querying the status of FairScheduler

2011-01-31 Thread Scott Chen (JIRA)
Issue Type: Improvement Components: contrib/fair-share Reporter: Scott Chen Priority: Minor It will be useful if we have some shell interface to obtain some status in FairScheduler. Just like we can use bin/hadoop job [-list|-list-trackers|-kill-task|...] to

[jira] Resolved: (MAPREDUCE-2283) TestBlockFixer hangs initializing MiniMRCluster

2011-01-27 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2283. --- Resolution: Fixed I just committed this to 0.22. Thanks for the reminder, Konstantin

[jira] Resolved: (MAPREDUCE-2283) TestBlockFixer hangs initializing MiniMRCluster

2011-01-27 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2283. --- Resolution: Fixed Assignee: Ramkumar Vadali Hadoop Flags: [Reviewed] I

[jira] Created: (MAPREDUCE-2239) BlockPlacementPolicyRaid should call getBlockLocations only when necessary

2011-01-03 Thread Scott Chen (JIRA)
/Reduce Issue Type: Improvement Components: contrib/raid Affects Versions: 0.23.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.23.0 Currently BlockPlacementPolicyRaid calls getBlockLocations for every chooseTarget(). This puts

Re: Review Request: MAPREDUCE-2224. Fix sync bugs in JvmManager

2010-12-23 Thread Scott Chen
Todd Lipcon wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/190/ > --- > > (Updated 2010-12-22 12:19:22) > >

[jira] Resolved: (MAPREDUCE-2212) MapTask and ReduceTask should only compress/decompress the final map output file

2010-12-16 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2212. --- Resolution: Won't Fix I am closing this now because I think there is no much benef

[jira] Resolved: (MAPREDUCE-2205) FairScheduler should not re-schedule jobs that have just been preempted

2010-12-16 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-2205. --- Resolution: Not A Problem > FairScheduler should not re-schedule jobs that have j

[jira] Created: (MAPREDUCE-2217) The expire launching task should cover the UNASSIGNED task

2010-12-13 Thread Scott Chen (JIRA)
: Improvement Components: jobtracker Affects Versions: 0.23.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.23.0 The ExpireLaunchingTask thread kills tasks that have been scheduled but have not responded. Currently if a task is scheduled on tasktracker

Review Request: Raid should rearrange the replicas while raiding

2010-12-09 Thread Scott Chen
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/160/ --- Review request for hadoop-mapreduce, Dhruba Borthakur and Ramkumar Vadali. Summar

Review Request: BlockPlacement policy for RAID

2010-12-09 Thread Scott Chen
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/159/ --- Review request for hadoop-mapreduce, Dhruba Borthakur and Ramkumar Vadali. Summar

[jira] Created: (MAPREDUCE-2212) MapTask and ReduceTask should only compress/decompress the final map output file

2010-12-07 Thread Scott Chen (JIRA)
: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.23.0 Currently if we set mapred.map.output.compression.codec 1. MapTask will compress every spill

[jira] Created: (MAPREDUCE-2207) Task-cleanup task should not be scheduled on the node that the task just failed

2010-11-30 Thread Scott Chen (JIRA)
: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.23.0 Reporter: Scott Chen Fix For: 0.23.0 Currently the task-cleanup task always goes to the same node where the task just failed. There is a higher chance that it hits

[jira] Created: (MAPREDUCE-2206) The task-cleanup tasks should be optional

2010-11-29 Thread Scott Chen (JIRA)
: jobtracker Affects Versions: 0.23.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.23.0 For jobs that do not use OutputCommitter.abort(), it should be possible to turn this off. This improves the latency of the job because failed tasks are often the bottleneck of

[jira] Created: (MAPREDUCE-2198) Allow FairScheduler to control the number of slots on each TaskTracker

2010-11-22 Thread Scott Chen (JIRA)
Issue Type: New Feature Components: contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 We can set the number of slots on the TaskTracker to be high and let FairScheduler handle the slots

[jira] Resolved: (MAPREDUCE-1704) Parity files that are outdated or nonexistent should be immediately disregarded

2010-11-07 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1704. --- Resolution: Invalid It's already been fixed. > Parity files that are out

[jira] Created: (MAPREDUCE-2131) Aggregate JobCounters and TaskCounters in a background thread

2010-10-13 Thread Scott Chen (JIRA)
: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Fix For: 0.22.0 JobTracker.getJobCounters() aggregates the counters each time they are requested. This may consume a lot of CPU if requests are too frequent. It may be good if we

[jira] Created: (MAPREDUCE-2125) Put map-reduce framework counters to JobTrackerMetricsInst

2010-10-11 Thread Scott Chen (JIRA)
: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 We have lots of useful information in the framework counters including #spills, filesystem read and write. It will be nice to put

[jira] Created: (MAPREDUCE-2124) Add counters for measuring time spent in three different phases in reducers

2010-10-11 Thread Scott Chen (JIRA)
/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.22.0 We currently have SLOTS_MILLIS_REDUCES which measures the total slot

[jira] Created: (MAPREDUCE-2108) Allow TaskScheduler manage number slots on TaskTrackers

2010-10-04 Thread Scott Chen (JIRA)
Components: contrib/capacity-sched, contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Currently the map slots and reduce slots are managed by TaskTracker configuration. To change the task

Re: welcome Scott Chen as a Hadoop Map-Reduce committer

2010-08-27 Thread Scott Chen
Thank you very much. It’s my honor. Scott On 8/27/10 5:20 PM, "Dhruba Borthakur" wrote: The Hadoop PMC has voted to make Scott Chen a committer for the Apache Hadoop Mapreduce project. Welcome Scott! Scott: please file your ICLA (http://www.apache.org/licenses/icla.txt) thanks, dhruba

[jira] Created: (MAPREDUCE-2026) JobTracker.getJobCounters() should not hold JobTracker lock while calling JobInProgress.getCounters()

2010-08-20 Thread Scott Chen (JIRA)
/browse/MAPREDUCE-2026 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 JobTracker.getJobCounters() will lock JobTracker and call JobInProgress.getCounters
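The locking concern in the issue title can be illustrated with a small sketch. `SketchJob` and `SketchTracker` are hypothetical stand-ins for JobInProgress and JobTracker, not the Hadoop classes; the point is only that the tracker lock covers the map lookup and is released before the per-job counter work begins.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobInProgress: guards its counters with its own lock.
class SketchJob {
    private long records;

    synchronized void addRecords(long n) {
        records += n;
    }

    // On a real job this aggregation could be slow; here it reads one field.
    synchronized long getCounters() {
        return records;
    }
}

// Hypothetical stand-in for JobTracker.
class SketchTracker {
    private final Map<String, SketchJob> jobs = new HashMap<>();

    synchronized void addJob(String id, SketchJob job) {
        jobs.put(id, job);
    }

    // Returns the job's counter value, or -1 if the job is unknown.
    long getJobCounters(String id) {
        SketchJob job;
        synchronized (this) {
            job = jobs.get(id); // tracker lock held only for the lookup
        }
        // The tracker lock is no longer held while the job does its work.
        return job == null ? -1L : job.getCounters();
    }

    public static void main(String[] args) {
        SketchTracker tracker = new SketchTracker();
        SketchJob job = new SketchJob();
        job.addRecords(5);
        tracker.addJob("job_1", job);
        System.out.println("counters: " + tracker.getJobCounters("job_1"));
    }
}
```

With this shape, a slow `getCounters()` call blocks only callers that need the same job, not every caller that needs the tracker.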

[jira] Resolved: (MAPREDUCE-1995) FairScheduler can wait for JobInProgress lock while holding JobTracker lock

2010-08-04 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1995. --- Resolution: Invalid After looking at it more carefully, I think this may not be a

[jira] Created: (MAPREDUCE-1995) FairScheduler can wait for JobInProgress lock while holding JobTracker lock

2010-08-04 Thread Scott Chen (JIRA)
/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 0.21.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 {code} JobInProgress.getTaskInProgress()(Locks JobInProgress) LocalityLevel.fromTask

[jira] Created: (MAPREDUCE-1974) FairScheduler can preempt the same task many times

2010-07-27 Thread Scott Chen (JIRA)
Reporter: Scott Chen Assignee: Scott Chen In FairScheduler.preemptTasks(), tasks are collected from JobInProgress.runningMapCache. But tasks repeat multiple times in JobInProgress.runningMapCache (on rack, node and cluster). This makes FairScheduler preempt the same task many times
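One way to avoid counting a task several times is to collect candidates into a set before preempting. This is only a sketch: the map-of-lists shape used for the running cache and the string task ids are assumptions for illustration, not the JobInProgress data structure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PreemptionDedupSketch {
    // Flattens a cache in which the same task appears under several
    // locality levels, keeping each task once in first-seen order.
    static List<String> distinctTasks(Map<String, List<String>> runningCache) {
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> tasks : runningCache.values()) {
            seen.addAll(tasks);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        Map<String, List<String>> cache = new LinkedHashMap<>();
        cache.put("node", List.of("t1"));
        cache.put("rack", List.of("t1", "t2"));
        cache.put("cluster", List.of("t1", "t2", "t3"));
        System.out.println(distinctTasks(cache));
    }
}
```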

[jira] Created: (MAPREDUCE-1970) Reed-Solomon code implementation to be used in raid

2010-07-26 Thread Scott Chen (JIRA)
Components: contrib/raid Reporter: Scott Chen Assignee: Scott Chen A Reed-Solomon erasure code implementation.

[jira] Created: (MAPREDUCE-1969) Allow raid to use Reed-Solomon erasure codes

2010-07-26 Thread Scott Chen (JIRA)
: contrib/raid Reporter: Scott Chen Fix For: 0.22.0 Currently raid uses one parity block per stripe which corrects one missing block on one stripe. Using Reed-Solomon code, we can add any number of parity blocks to tolerate more missing blocks. This way we can get a good
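For context, the single-parity case described above can be sketched as follows, assuming the one parity block is a plain XOR of the stripe's data blocks. This is not a Reed-Solomon implementation; it only shows why one parity block can rebuild exactly one missing block. The class and method names are hypothetical.

```java
import java.util.Arrays;

class XorParitySketch {
    // Parity block = byte-wise XOR of all data blocks in the stripe.
    static byte[] parity(byte[][] blocks) {
        byte[] p = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < p.length; i++) {
                p[i] ^= block[i];
            }
        }
        return p;
    }

    // Rebuilds the block at index `missing` from the parity and the survivors.
    // The contents of blocks[missing] are never read.
    static byte[] recover(byte[][] blocks, int missing, byte[] parity) {
        byte[] out = parity.clone();
        for (int b = 0; b < blocks.length; b++) {
            if (b == missing) {
                continue;
            }
            for (int i = 0; i < out.length; i++) {
                out[i] ^= blocks[b][i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] stripe = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        byte[] p = parity(stripe);
        System.out.println(Arrays.toString(recover(stripe, 1, p)));
    }
}
```

If two blocks of the stripe are lost, the XOR of the survivors yields only the XOR of the two missing blocks, which is the limitation that additional parity blocks are meant to remove.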

[jira] Created: (MAPREDUCE-1950) Jetty Acceptor can stuck TaskTracker forever

2010-07-19 Thread Scott Chen (JIRA)
: tasktracker Affects Versions: 0.20.1 Reporter: Scott Chen We have observed some TaskTrackers keep getting blacklisted because of "Too many fetch-failure" and the logs are full of the following exception. There can be lots of these exceptions in one millisecond. {code} 2010-0

[jira] Created: (MAPREDUCE-1910) Allow raid policy to specify a parent policy

2010-07-01 Thread Scott Chen (JIRA)
: contrib/raid Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.22.0 We encountered the problem that there is a lot of redundancy in our raid.xml file. Most of the policies share the same properties

[jira] Created: (MAPREDUCE-1903) Allow different slowTaskThreshold for mappers and reducers

2010-06-30 Thread Scott Chen (JIRA)
: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 We have been running the new speculative logic in HADOOP-2141 done by Andy in our production cluster. One thing that we observed is

[jira] Resolved: (MAPREDUCE-1361) In the pools with minimum slots, new job will always receive slots even if the minimum slots limit has been fulfilled

2010-06-16 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1361. --- Resolution: Won't Fix > In the pools with minimum slots, new job will always

[jira] Created: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2010-06-11 Thread Scott Chen (JIRA)
Components: contrib/raid Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Raided files introduce extra dependencies on the blocks on the same stripe. Therefore we need a new way to place the blocks. It is desirable that raided

[jira] Created: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics

2010-06-09 Thread Scott Chen (JIRA)
Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 It would be nice if we could collect this information in JobTracker metrics.

[jira] Created: (MAPREDUCE-1845) FairScheduler.tasksToPeempt() can return negative number

2010-06-07 Thread Scott Chen (JIRA)
Components: contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.22.0 This method can return a negative number. This will cause the preemption to under-preempt. The bug was
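The snippet is cut off before it explains the cause, so the following is only an illustration of why a negative return value leads to under-preemption, under the assumption that per-job results are summed into a total. The class, method names, and share/running numbers are hypothetical, and clamping at zero is one possible guard rather than the committed fix.

```java
class PreemptCountSketch {
    // Number of tasks a job is still owed; clamped so it is never negative.
    static int tasksOwed(int share, int running) {
        return Math.max(0, share - running);
    }

    // Total across jobs. Without the clamp, a job running above its share
    // would contribute a negative value and cancel another job's deficit.
    static int totalToPreempt(int[] shares, int[] running) {
        int total = 0;
        for (int i = 0; i < shares.length; i++) {
            total += tasksOwed(shares[i], running[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] shares = {5, 5};
        int[] running = {8, 2};
        System.out.println("to preempt: " + totalToPreempt(shares, running));
    }
}
```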

[jira] Created: (MAPREDUCE-1837) Raid should store the metadata in HDFS

2010-06-02 Thread Scott Chen (JIRA)
/raid Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Currently, if you change the stripe length in the raid policy, the existing raided files cannot be recovered. Also in the future if we want to upgrade to a better erasure

[jira] Created: (MAPREDUCE-1831) Delete the replica on the most concentrated node when raiding file

2010-06-01 Thread Scott Chen (JIRA)
Issue Type: Improvement Components: contrib/raid Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 In raid, it is good to have the blocks on the same stripe located on different machines. This way when one machine

[jira] Created: (MAPREDUCE-1829) JobInProgress.findSpeculativeTask should use min() to find the candidate instead of sort()

2010-06-01 Thread Scott Chen (JIRA)
Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.22.0 findSpeculativeTask needs only one
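The difference between the two approaches in the title can be sketched generically: when only the single best candidate is needed, a linear scan with `Collections.min` gives the same element as sorting and taking the first entry, without the O(n log n) sort. The helper name and the use of plain `Double` progress rates are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class MinInsteadOfSortSketch {
    // One O(n) pass; returns null when there are no candidates.
    static <T> T best(List<T> candidates, Comparator<? super T> order) {
        return candidates.isEmpty() ? null : Collections.min(candidates, order);
    }

    public static void main(String[] args) {
        // Hypothetical progress rates; the slowest task is the candidate.
        List<Double> rates = Arrays.asList(0.5, 0.1, 0.9);
        System.out.println("candidate: " + best(rates, Comparator.<Double>naturalOrder()));
    }
}
```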

[jira] Created: (MAPREDUCE-1823) Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode

2010-05-28 Thread Scott Chen (JIRA)
Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 RaidNode makes lots of calls of HarFileSystem.getFileStatus. This method fetches information from DataNode so it is slow. It becomes the

[jira] Resolved: (MAPREDUCE-1756) FairScheduler may assign tasks over the TaskTracker limit

2010-05-06 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1756. --- Resolution: Invalid Sorry, I made a mistake. CapBasedLoadManager.canAssign{Map,Reduce

[jira] Created: (MAPREDUCE-1764) FairScheduler locality delay may put heavy pressure on Jobtracker

2010-05-06 Thread Scott Chen (JIRA)
Issue Type: Bug Reporter: Scott Chen Assignee: Scott Chen FairScheduler locality delay feature holds the scheduling of jobs until it gets good locality. This greatly improves the locality of the tasks and reduces the cost of traffic. We have observed the following problem on

[jira] Created: (MAPREDUCE-1762) Add a setValue() method in Counter

2010-05-06 Thread Scott Chen (JIRA)
Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Counters are very useful because the logging and transmitting are already there. It is very convenient to transmit and store numbers. But currently Counter only has an increment() method. It will be nice if

[jira] Created: (MAPREDUCE-1761) FairScheduler should allow separate configuration of node and rack locality wait time

2010-05-06 Thread Scott Chen (JIRA)
Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 It would be nice that we can separately assign rack locality wait time. In our use case, we would set node

[jira] Created: (MAPREDUCE-1756) FairScheduler may assign tasks over the TaskTracker limit

2010-05-05 Thread Scott Chen (JIRA)
Components: contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 FairScheduler may assign tasks over the TaskTracker limit. The over assigned task will wait on the TaskTracker in the state of

[jira] Created: (MAPREDUCE-1739) Collecting CPU and memory usage for MapReduce jobs

2010-04-28 Thread Scott Chen (JIRA)
Reporter: Scott Chen Assignee: Scott Chen MAPREDUCE-220 collects CPU and memory usage for each task. We can aggregate them to get the information per job. Such information can be used for scheduling, profiling or charging the users based on the resource they consumed. Here are

[jira] Created: (MAPREDUCE-1608) Allow users to do speculative execution of a task manually

2010-03-18 Thread Scott Chen (JIRA)
Feature Reporter: Scott Chen Speculative execution improves the latency of the job. Sometimes the job has a few very slow reducers. Spending a little more resource on speculative tasks can improve the latency a lot. It would be nice if users could manually select one task and force

[jira] Created: (MAPREDUCE-1568) TrackerDistributedCacheManager should do deleteLocalPath asynchronously

2010-03-05 Thread Scott Chen (JIRA)
Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 TrackerDistributedCacheManager.deleteCache() has been improved: MAPREDUCE-1302 makes TrackerDistributedCacheManager rename the caches in the

[jira] Created: (MAPREDUCE-1546) jobdetails.jsp and taskdetials.jsp should show links to the corresponding jobdetailshistory.jsp and taskdetailshistory.jsp if the task/job is gone

2010-03-01 Thread Scott Chen (JIRA)
-- Key: MAPREDUCE-1546 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1546 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Priority

[jira] Created: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit

2010-02-25 Thread Scott Chen (JIRA)
/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 TrackerDistributedCacheManager deletes the cached files when

[jira] Created: (MAPREDUCE-1463) Reducer should start faster for smaller jobs

2010-02-05 Thread Scott Chen (JIRA)
: contrib/fair-share Reporter: Scott Chen Assignee: Scott Chen Our users often complain about the slowness of smaller ad-hoc jobs. The overhead to wait for the reducers to start in this case is significant. It will be good if we can start the reducer sooner in this case

[jira] Created: (MAPREDUCE-1382) MRAsyncDiscService should tolerate missing local.dir

2010-01-15 Thread Scott Chen (JIRA)
Reporter: Scott Chen Assignee: Zheng Shao Currently, when some of the local.dir directories do not exist, MRAsyncDiscService will fail. It should fail only when none of the directories work.
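The requested behaviour, skipping unusable directories and failing only when none remain, can be sketched as below. `LocalDirSketch.usableDirs` is a hypothetical helper, not part of the service in question, and the usability check (exists, is a directory, is writable) is an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LocalDirSketch {
    // Keeps the directories that exist and are writable; throws only
    // when every configured directory is unusable.
    static List<File> usableDirs(List<File> configured) {
        List<File> usable = new ArrayList<>();
        for (File dir : configured) {
            if (dir.isDirectory() && dir.canWrite()) {
                usable.add(dir);
            }
        }
        if (usable.isEmpty()) {
            throw new IllegalStateException("No usable local directory among " + configured);
        }
        return usable;
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        File missing = new File(tmp, "no-such-dir-" + System.nanoTime());
        System.out.println(usableDirs(Arrays.asList(missing, tmp)));
    }
}
```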

[jira] Created: (MAPREDUCE-1361) In the pools with minimum slots, new job will always receive slots even if the minimum slots limit has been fulfilled

2010-01-06 Thread Scott Chen (JIRA)
: https://issues.apache.org/jira/browse/MAPREDUCE-1361 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 0.20.1 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.20.1 In 0.20

[jira] Resolved: (MAPREDUCE-1345) JobTracker is slowed down because it forks subprocesses to do a df command

2009-12-29 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1345. --- Resolution: Duplicate > JobTracker is slowed down because it forks subprocesses to d

[jira] Created: (MAPREDUCE-1265) Include jobId and hostname in the task attempt error log

2009-12-03 Thread Scott Chen (JIRA)
: Improvement Reporter: Scott Chen Assignee: Scott Chen Priority: Trivial When a task attempt receives an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress

[jira] Created: (MAPREDUCE-1243) ant compile-test in contrib/streaming fails

2009-11-25 Thread Scott Chen (JIRA)
/streaming Reporter: Scott Chen Compile fails. It seems that hdfs-test jar file cannot be found. compile-test: [echo] contrib: streaming [javac] Compiling 44 source files to /home/schen/asf-mapred2/build/contrib/streaming/test [javac] /home/schen/asf-mapred2/src/contrib

[jira] Created: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers

2009-11-17 Thread Scott Chen (JIRA)
Environment: linux Reporter: Scott Chen Assignee: Scott Chen The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information.

[jira] Created: (MAPREDUCE-1201) Make ProcfsBasedProcessTree collect CPU usage information

2009-11-10 Thread Scott Chen (JIRA)
-task Reporter: Scott Chen Assignee: Scott Chen This information can be reported back to jobtracker to help profile jobs and schedule tasks.

[jira] Created: (MAPREDUCE-1198) Alternatively schedule different types of tasks in fair share scheduler

2009-11-09 Thread Scott Chen (JIRA)
Issue Type: Improvement Components: contrib/fair-share Reporter: Scott Chen Assignee: Scott Chen Matei has mentioned in MAPREDUCE-961 that the current scheduler will first try to launch map tasks until canLaunchTask() returns false then look for reduce

[jira] Resolved: (MAPREDUCE-1181) Enforce RSS memory limit in TaskMemoryManagerThread

2009-11-08 Thread Scott Chen (JIRA)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-1181. --- Resolution: Invalid > Enforce RSS memory limit in TaskMemoryManagerThr

[jira] Created: (MAPREDUCE-1181) Enforce RSS memory limit in TaskMemoryManagerThread

2009-11-03 Thread Scott Chen (JIRA)
Components: tasktracker Affects Versions: 0.20.1 Reporter: Scott Chen Fix For: 0.20.1 TaskMemoryManagerThread will periodically check the rss memory usage of every task. If the memory usage exceeds the specified threshold, the task will be killed. Also if the

[jira] Created: (MAPREDUCE-1167) Make ProcfsBasedProcessTree collect rss memory information

2009-10-29 Thread Scott Chen (JIRA)
Feature Components: tasktracker Affects Versions: 0.20.1 Reporter: Scott Chen Right now ProcfsBasedProcessTree collects only virtual memory. We can make it collect rss memory as well. Later we can use rss in TaskMemoryManagerThread to obtain better memory management
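As a rough sketch of what collecting both values from procfs involves: in the Linux `/proc/<pid>/stat` line, field 23 is the virtual memory size in bytes and field 24 is the resident set size in pages. The parser below is a hypothetical helper, not the ProcfsBasedProcessTree code, and the page size is passed in as an assumption (4096 bytes is common but not universal).

```java
class ProcStatRssSketch {
    // Splits the part of the stat line that follows the process name.
    // The name (field 2) is parenthesised and may contain spaces, so the
    // split starts after the last ')'. rest[0] is then field 3 (state).
    private static String[] fieldsAfterName(String statLine) {
        int close = statLine.lastIndexOf(')');
        return statLine.substring(close + 2).split(" ");
    }

    // Field 23: virtual memory size in bytes.
    static long vsizeBytes(String statLine) {
        return Long.parseLong(fieldsAfterName(statLine)[20]);
    }

    // Field 24: resident set size in pages, converted to bytes.
    static long rssBytes(String statLine, long pageSizeBytes) {
        return Long.parseLong(fieldsAfterName(statLine)[21]) * pageSizeBytes;
    }

    public static void main(String[] args) {
        // Synthetic stat line with placeholder values for fields 4 to 22.
        String line = "1234 (java worker) S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 4096000 250 0";
        System.out.println("vsize bytes: " + vsizeBytes(line));
        System.out.println("rss bytes:   " + rssBytes(line, 4096L));
    }
}
```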

[jira] Created: (MAPREDUCE-1129) Assign multiple Map and Reduce tasks in Fairscheduler

2009-10-21 Thread Scott Chen (JIRA)
Components: contrib/fair-share Affects Versions: 0.20.1 Reporter: Scott Chen In Hadoop-0.20, the period of heartbeat becomes much longer. Fairscheduler assigns at most one Map and one Reduce task per heartbeat. This makes the cluster become very inefficient. Often time only