[jira] Commented: (HADOOP-6105) Provide a way to automatically handle backward compatibility of deprecated keys

2009-06-25 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723993#action_12723993 ] Arun C Murthy commented on HADOOP-6105: --- +1 for this direction. > Provide a

[jira] Commented: (HADOOP-5985) A single slow (but not dead) map TaskTracker impedes MapReduce progress

2009-06-21 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722454#action_12722454 ] Arun C Murthy commented on HADOOP-5985: --- Sorry for jumping in late, I som

[jira] Updated: (HADOOP-5985) A single slow (but not dead) map TaskTracker impedes MapReduce progress

2009-06-21 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5985: -- Component/s: mapred > A single slow (but not dead) map TaskTracker impedes MapReduce progr

[jira] Commented: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-19 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721676#action_12721676 ] Arun C Murthy commented on HADOOP-5964: --- Forgot to add that I've m

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-19 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_8_20090618.patch Thanks for the review Hemanth - as you pointed out

[jira] Created: (HADOOP-6090) GridMix is broke after upgrading random(text)writer to newer mapreduce apis

2009-06-18 Thread Arun C Murthy (JIRA)
Issue Type: Bug Components: benchmarks Affects Versions: 0.21.0 Reporter: Arun C Murthy Fix For: 0.21.0 GridMix data generation scripts need to use the newer mapreduce api. -- This message is automatically generated by JIRA. - You can reply to this

[jira] Commented: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-18 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721114#action_12721114 ] Arun C Murthy commented on HADOOP-5964: --- Some notes about this patch: #

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-18 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_7_20090618.patch Some bug fixes and added counters to track how long

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-16 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_6_20090617.patch Reasonably well tested patch, appreciate any

[jira] Commented: (HADOOP-5795) Add a bulk FIleSystem.getFileBlockLocations

2009-06-16 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720459#action_12720459 ] Arun C Murthy commented on HADOOP-5795: --- Quick note: making the length manda

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-15 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_4_20090615.patch After much consideration I decided to revert back to

[jira] Created: (HADOOP-6014) Improvements to Global Black-listing of TaskTrackers

2009-06-11 Thread Arun C Murthy (JIRA)
Components: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Fix For: 0.21.0 HADOOP-4305 added a global black-list of tasktrackers. We saw a scenario on one of our clusters where a few jobs caused a lot of tasktrackers to immediately be blacklisted. This was caused

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_2_20090609.patch Updated patch. > Fix the 'cluster drain

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_1_20090608.patch Reasonably well tested patch for review... this

[jira] Commented: (HADOOP-5090) The capacity-scheduler should assign multiple tasks per heartbeat

2009-06-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716723#action_12716723 ] Arun C Murthy commented on HADOOP-5090: --- Also, I *really* don't think

[jira] Assigned: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned HADOOP-5964: - Assignee: Arun C Murthy > Fix the 'cluster drain' problem in the Capacity

[jira] Commented: (HADOOP-5090) The capacity-scheduler should assign multiple tasks per heartbeat

2009-06-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716526#action_12716526 ] Arun C Murthy commented on HADOOP-5090: --- I'd strongly urge *against*

[jira] Commented: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

2009-06-03 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715857#action_12715857 ] Arun C Murthy commented on HADOOP-5884: --- I'm just proposing we add #slo

[jira] Updated: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-02 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5964: -- Attachment: HADOOP-5964_0_20090602.patch Very early patch. I haven't introdu

[jira] Commented: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-02 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715768#action_12715768 ] Arun C Murthy commented on HADOOP-5964: --- A _much_ better model is for the sched

[jira] Created: (HADOOP-5964) Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs

2009-06-02 Thread Arun C Murthy (JIRA)
adoop Core Issue Type: Bug Components: contrib/capacity-sched Affects Versions: 0.20.0 Reporter: Arun C Murthy Fix For: 0.21.0 When a HighRAMJob turns up at the head of the queue, the current implementation of support for HighRAMJobs in the Capacity Sch

[jira] Commented: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

2009-06-01 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715281#action_12715281 ] Arun C Murthy commented on HADOOP-5884: --- Long term - we really should

[jira] Commented: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

2009-06-01 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715275#action_12715275 ] Arun C Murthy commented on HADOOP-5884: --- Can we also add the number of slots to

[jira] Commented: (HADOOP-5664) Use of ReentrantLock.lock() in MapOutputBuffer takes up too much cpu time

2009-05-28 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714243#action_12714243 ] Arun C Murthy commented on HADOOP-5664: --- +1 > Use of ReentrantLock.lo

[jira] Resolved: (HADOOP-5885) Error occurred during initialization of VM

2009-05-28 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-5885. --- Resolution: Invalid > Error occurred during initialization of

[jira] Commented: (HADOOP-5885) Error occurred during initialization of VM

2009-05-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713644#action_12713644 ] Arun C Murthy commented on HADOOP-5885: --- This looks like a bad misconfigura

Re: [VOTE] freeze date for Hadoop 0.21

2009-05-27 Thread Arun C Murthy
On May 26, 2009, at 5:11 PM, Owen O'Malley wrote: I'd like to propose a code freeze and branch date of 7/31. One major exception is for HDFS file append, which I think we need in 0.21 and will take longer than that. +1 Arun

[jira] Commented: (HADOOP-5876) Shuffling information logged to userlogs/attempt_####_###_r_###_#/syslogs

2009-05-20 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711277#action_12711277 ] Arun C Murthy commented on HADOOP-5876: --- What kind of shuffle information are

[jira] Created: (HADOOP-5870) Implement a memory-to-memory sort in the map task

2009-05-19 Thread Arun C Murthy (JIRA)
: mapred Reporter: Arun C Murthy The motivation is similar to HADOOP-5831... Currently we collect map-outputs in the sort buffer (io.sort.mb) which we eventually sort and spill to disk. For latency-sensitive applications with sufficient memory, e.g. terasort, we could do better by

[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

2009-05-15 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709765#action_12709765 ] Arun C Murthy commented on HADOOP-5846: --- Haven't we had problems with jo

[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

2009-05-15 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709761#action_12709761 ] Arun C Murthy commented on HADOOP-5846: --- Is the proposal to add another

[jira] Created: (HADOOP-5831) Implement memory-to-memory merge in the reduce

2009-05-14 Thread Arun C Murthy (JIRA)
Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.21.0 HADOOP-3446 fixed the reduce to not flush the in-memory shuffled map-outputs before feeding to the reduce. However for latency-sensitive applications with lots of memory like the terasort this

[jira] Created: (HADOOP-5830) Reuse output collectors across maps running on the same jvm

2009-05-14 Thread Arun C Murthy (JIRA)
Components: mapred Reporter: Arun C Murthy We have evidence that cutting the shuffle-crossbar between maps and reduces (m * r) leads to performant applications since: # It cuts down the number of connections necessary to shuffle and hence reduces load on the serving-side

[jira] Commented: (HADOOP-5737) UGI checks in testcases are broken

2009-05-11 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708336#action_12708336 ] Arun C Murthy commented on HADOOP-5737: --- +1 > UGI checks in testcases are

[jira] Commented: (HADOOP-5795) Add a bulk FIleSystem.getFileBlockLocations

2009-05-11 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708132#action_12708132 ] Arun C Murthy commented on HADOOP-5795: --- Dhruba, I was thinking it was implic

[jira] Commented: (HADOOP-4874) Remove bindings to lzo

2009-05-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707564#action_12707564 ] Arun C Murthy commented on HADOOP-4874: --- Tatu - please open a new jira for fa

[jira] Commented: (HADOOP-5795) Add a bulk FIleSystem.getFileBlockLocations

2009-05-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707446#action_12707446 ] Arun C Murthy commented on HADOOP-5795: --- bq. Map listBlockLocations(Path[]);

[jira] Resolved: (HADOOP-5790) Allow shuffle read and connection timeouts to be configurable

2009-05-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-5790. --- Resolution: Duplicate Assignee: (was: Arun C Murthy) Duplicate of HADOOP-5791

[jira] Created: (HADOOP-5795) Add a bulk FIleSystem.getFileBlockLocations

2009-05-08 Thread Arun C Murthy (JIRA)
Affects Versions: 0.20.0 Reporter: Arun C Murthy Fix For: 0.21.0 Currently map-reduce applications (specifically file-based input-formats) use FileSystem.getFileBlockLocations to compute splits. However they are forced to call it once per file. The downsides are multiple

[jira] Created: (HADOOP-5790) Allow shuffle read and connection timeouts to be configurable

2009-05-08 Thread Arun C Murthy (JIRA)
Components: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.21.0 It would be good for latency-sensitive applications to tune the shuffle read/connection timeouts...

[jira] Created: (HADOOP-5789) Allow shuffle read and connection timeouts to be configurable

2009-05-08 Thread Arun C Murthy (JIRA)
Components: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.21.0 It would be good for latency-sensitive applications to tune the shuffle read/connection timeouts... in fact this made a huge difference

[jira] Created: (HADOOP-5788) Improvements to RPC between Child and TaskTracker

2009-05-08 Thread Arun C Murthy (JIRA)
: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.21.0 We could improve the RPC between the Child and TaskTracker: * Set ping interval lower by default to 5s * Disable nagle's algorithm (tcp no-delay)
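
For readers unfamiliar with the two knobs named in HADOOP-5788, here is a minimal sketch in plain java.net, not the Hadoop RPC code itself. The class name NoDelayExample, the method configure, and the constant PING_INTERVAL_MS are hypothetical names chosen for illustration, and the read timeout is only an assumed stand-in for the ping interval mentioned in the issue.

```java
import java.io.IOException;
import java.net.Socket;

public class NoDelayExample {
    // Hypothetical constant mirroring the 5s ping interval proposed in the issue.
    static final int PING_INTERVAL_MS = 5_000;

    // Applies the two settings to a socket; safe to call before connect().
    static Socket configure(Socket socket) throws IOException {
        // TCP_NODELAY disables Nagle's algorithm, so small writes are sent
        // immediately instead of being coalesced into larger segments.
        socket.setTcpNoDelay(true);
        // Assumption: a read timeout stands in for the ping interval; a reader
        // blocked longer than this gets a SocketTimeoutException and can ping.
        socket.setSoTimeout(PING_INTERVAL_MS);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // An unconnected socket is enough to show the options being applied.
        try (Socket socket = configure(new Socket())) {
            System.out.println("tcpNoDelay=" + socket.getTcpNoDelay());
            System.out.println("soTimeoutMs=" + socket.getSoTimeout());
        }
    }
}
```

The trade-off is the usual one: with Nagle's algorithm off, small messages such as status updates leave the sender sooner, at the cost of more packets on the wire.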

[jira] Created: (HADOOP-5787) Allow HADOOP_ROOT_LOGGER to be configured via conf/hadoop-env.sh

2009-05-08 Thread Arun C Murthy (JIRA)
: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy Currently it's set in bin/hadoop-daemon.sh... we should allow it to be specified in conf/hadoop-env.sh

[jira] Updated: (HADOOP-5787) Allow HADOOP_ROOT_LOGGER to be configured via conf/hadoop-env.sh

2009-05-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5787: -- Component/s: scripts Affects Version/s: 0.20.0 Fix Version/s: 0.21.0

[jira] Commented: (HADOOP-5737) UGI checks in testcases are broken

2009-05-07 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707126#action_12707126 ] Arun C Murthy commented on HADOOP-5737: --- bq. For case #1, JobTracker.getFileSy

[jira] Commented: (HADOOP-5737) UGI checks in testcases are broken

2009-05-06 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706516#action_12706516 ] Arun C Murthy commented on HADOOP-5737: --- The patch looks fine. I'd rath

[jira] Issue Comment Edited: (HADOOP-5670) Hadoop configurations should be read from a distributed system

2009-05-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706194#action_12706194 ] Arun C Murthy edited comment on HADOOP-5670 at 5/5/09 2:0

[jira] Commented: (HADOOP-5772) Implement a 'refreshable' configuration system with right access-controls etc.

2009-05-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706195#action_12706195 ] Arun C Murthy commented on HADOOP-5772: --- Sure, alternatively we could use this

[jira] Commented: (HADOOP-5670) Hadoop configurations should be read from a distributed system

2009-05-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706194#action_12706194 ] Arun C Murthy commented on HADOOP-5670: --- One important and related feature are

[jira] Commented: (HADOOP-5643) Ability to blacklist tasktracker

2009-05-05 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706135#action_12706135 ] Arun C Murthy commented on HADOOP-5643: --- bq.I think we are going throug

[jira] Created: (HADOOP-5772) Implement a 'refreshable' configuration system with right access-controls etc.

2009-05-05 Thread Arun C Murthy (JIRA)
adoop Core Issue Type: New Feature Components: conf Reporter: Arun C Murthy We have various bits and pieces of code to refresh certain configuration files, various components to restrict access to who can actually refresh the configs etc. I propose we start thin

[jira] Commented: (HADOOP-5768) add new options to hadoop job -list-attempt-ids to dump counters and diagnostic messages

2009-05-04 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705663#action_12705663 ] Arun C Murthy commented on HADOOP-5768: --- +1 We should also expand the suppo

Re: Decoupling Intertracker protocol

2009-04-24 Thread Arun C Murthy
On Apr 22, 2009, at 7:17 PM, Samprita Hegde wrote: Hi, I am trying to see the feasibility of using shared spaces for the communication of the Task Completion Events in Hadoop. For this I am trying to replace the InterTracker Protocol with a co-ordination space so that one thread in a Task Tr

[jira] Commented: (HADOOP-5737) UGI checks in testcases are broken

2009-04-24 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702309#action_12702309 ] Arun C Murthy commented on HADOOP-5737: --- Also, another thing to check is whethe

[jira] Commented: (HADOOP-5737) UGI checks in testcases are broken

2009-04-24 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702306#action_12702306 ] Arun C Murthy commented on HADOOP-5737: --- Does this anomaly happen in all test-c

[jira] Commented: (HADOOP-5726) Remove pre-emption from the capacity scheduler code base

2009-04-24 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702304#action_12702304 ] Arun C Murthy commented on HADOOP-5726: --- +1 for the direction. I'd propos

[jira] Commented: (HADOOP-5725) TaskTracker shuold run user tasks nicely in the local machine

2009-04-23 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702254#action_12702254 ] Arun C Murthy commented on HADOOP-5725: --- bq. I had jobs with tasks expected to

[jira] Commented: (HADOOP-5632) Jobtracker leaves tasktrackers underutilized

2009-04-23 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702039#action_12702039 ] Arun C Murthy commented on HADOOP-5632: --- Well, we need to consider the 'p

[jira] Commented: (HADOOP-5632) Jobtracker leaves tasktrackers underutilized

2009-04-22 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701550#action_12701550 ] Arun C Murthy commented on HADOOP-5632: --- bq. If running tasks status updates ca

[jira] Commented: (HADOOP-5632) Jobtracker leaves tasktrackers underutilized

2009-04-22 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701543#action_12701543 ] Arun C Murthy commented on HADOOP-5632: --- bq. ?? The sequence number seems str

[jira] Updated: (HADOOP-5684) separate jvm param for mapper and reducer

2009-04-20 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5684: -- Attachment: HADOOP-5684_0_20090420.patch I've been running for a while with this

[jira] Assigned: (HADOOP-5684) separate jvm param for mapper and reducer

2009-04-20 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned HADOOP-5684: - Assignee: Arun C Murthy > separate jvm param for mapper and redu

[jira] Resolved: (HADOOP-5639) Remove cyclic calls between JobTracker, JobInProgress and TaskInProgress

2009-04-08 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-5639. --- Resolution: Duplicate Duplicate of HADOOP-869. > Remove cyclic calls between JobTrac

[jira] Commented: (HADOOP-5572) The map progress value should have a separate phase for doing the final sort.

2009-03-25 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689125#action_12689125 ] Arun C Murthy commented on HADOOP-5572: --- +1 > The map progress value should

[jira] Commented: (HADOOP-5531) Disable Chukwa unit tests on branch-0.20

2009-03-19 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683513#action_12683513 ] Arun C Murthy commented on HADOOP-5531: --- +1 > Disable Chukwa unit tests on

[jira] Commented: (HADOOP-5514) Add waiting/failed tasks to JobTracker metrics

2009-03-17 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682882#action_12682882 ] Arun C Murthy commented on HADOOP-5514: --- +1. Eventually we should rep

[jira] Commented: (HADOOP-5382) The new map/reduce api doesn't support combiners

2009-03-17 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682852#action_12682852 ] Arun C Murthy commented on HADOOP-5382: --- +1 > The new map/reduce api

[jira] Commented: (HADOOP-5281) GzipCodec fails second time it is used in a process

2009-03-17 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682745#action_12682745 ] Arun C Murthy commented on HADOOP-5281: --- +1 > GzipCodec fails second tim

[jira] Resolved: (HADOOP-5442) The job history display needs to be paged

2009-03-16 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-5442. --- Resolution: Fixed Fix Version/s: 0.21.0 I just committed this. Thanks, Amar! >

[jira] Commented: (HADOOP-5459) CRC errors not detected reading intermediate output into memory with problematic length

2009-03-13 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681948#action_12681948 ] Arun C Murthy commented on HADOOP-5459: --- +1 > CRC errors not detected

[jira] Commented: (HADOOP-4996) JobControl does not report killed jobs

2009-02-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677598#action_12677598 ] Arun C Murthy commented on HADOOP-4996: --- Amareshwari, if I remember right -

[jira] Commented: (HADOOP-4744) Wrong resolution of hostname and port

2009-02-22 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675763#action_12675763 ] Arun C Murthy commented on HADOOP-4744: --- +1 > Wrong resolution of hostn

[jira] Commented: (HADOOP-5299) Reducer inputs should be spilled to HDFS rather than local disk.

2009-02-20 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675440#action_12675440 ] Arun C Murthy commented on HADOOP-5299: --- Caveat emptor: # Typical mid-to-l

Re: [VOTE] Release Hadoop 0.19.1 (candidate 0)

2009-02-19 Thread Arun C Murthy
On Feb 19, 2009, at 4:28 PM, Nigel Daley wrote: I have created a candidate build for Hadoop 0.19.1. This release fixes some data loss issues in 0.19.0 and introduces an incompatible change by disabling the file append API (HADOOP-5224). As you can see in Jira, there are still a couple mapr

Re: Integration with SGE

2009-02-18 Thread Arun C Murthy
On Feb 18, 2009, at 10:37 AM, Amin Astaneh wrote: Lukáš- Well, we have a graduate student that is using our facilities for a Masters' thesis in Map/Reduce. You guys are generating topics in computer science research. What do we need to do in order to get our documentation on the Hadoop

[jira] Commented: (HADOOP-4490) Map and Reduce tasks should run as the user who submitted the job

2009-02-17 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674504#action_12674504 ] Arun C Murthy commented on HADOOP-4490: --- bq. If it is felt that this should be

[jira] Updated: (HADOOP-4490) Map and Reduce tasks should run as the user who submitted the job

2009-02-17 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-4490: -- Status: Open (was: Patch Available) Some comments after a discussion with Hemanth: # We

[jira] Resolved: (HADOOP-5244) Distributed cache spends a lot of time runing du -s

2009-02-13 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-5244. --- Resolution: Duplicate Duplicate of HADOOP-4780. > Distributed cache spends a lot of t

[jira] Commented: (HADOOP-5223) Refactor reduce shuffle code

2009-02-12 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672923#action_12672923 ] Arun C Murthy commented on HADOOP-5223: --- Forgot to add that we felt it's

[jira] Updated: (HADOOP-5223) Refactor reduce shuffle code

2009-02-12 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5223: -- Attachment: HADOOP-5233_part0.patch Another puzzle piece, includes a completed Fetcher

[jira] Updated: (HADOOP-5223) Refactor reduce shuffle code

2009-02-11 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5223: -- Attachment: HADOOP-5233_api.patch Here is the rough draft of the api we've been worki

[jira] Commented: (HADOOP-5160) Hadoop reduce scheduler sometimes leaves machines idle

2009-02-06 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671343#action_12671343 ] Arun C Murthy commented on HADOOP-5160: --- Were other jobs 'alive' in t

[jira] Commented: (HADOOP-5160) Hadoop reduce scheduler sometimes leaves machines idle

2009-02-06 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671054#action_12671054 ] Arun C Murthy commented on HADOOP-5160: --- As of hadoop-0.18 the Map-Reduce sched

Re: how to setup lzop in hadoop

2009-02-05 Thread Arun C Murthy
On Feb 5, 2009, at 1:13 AM, zhangwei wrote: Hi, all I run a job with lzop enabled (mapred.output.compress=true and mapred.output.compression.codec=org.apache.hadoop.io.compress.LzopCodec), but it always fails with these logs, seems it can’t load native-lzo library, but I don’t know how to set

[jira] Updated: (HADOOP-5138) Current Chukwa Trunk failed contrib unit tests.

2009-02-02 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5138: -- Resolution: Fixed Fix Version/s: 0.21.0 Status: Resolved (was: Patch

[jira] Commented: (HADOOP-4033) Task tracker should not ask for tasks if its temp disk space is almost full

2009-01-30 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668964#action_12668964 ] Arun C Murthy commented on HADOOP-4033: --- bq. Actually, I'd propose that we

[jira] Created: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

2009-01-29 Thread Arun C Murthy (JIRA)
: 0.20.0 Reporter: Arun C Murthy Priority: Blocker Fix For: 0.20.0 For some reason LocalDirAllocator.getLocalPathToRead doesn't find files which are present; extra logging shows: {noformat} 2009-01-30 06:43:32,312

[jira] Commented: (HADOOP-5131) ReduceCopier sleeps for a hardcoded interval of 5secs

2009-01-29 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668387#action_12668387 ] Arun C Murthy commented on HADOOP-5131: --- Good catch Devaraj! I actually saw the

Re: [VOTE] Release Hadoop 0.18.3 (candidate 0)

2009-01-28 Thread Arun C Murthy
On Jan 22, 2009, at 10:46 PM, Nigel Daley wrote: I have created a candidate build for Hadoop 0.18.3. This fixes 51 issues in 0.18.2, including several problems that may lead to data loss from the file system. *** Please download and test it before the *** vote closes on Tuesday, January 2

[jira] Updated: (HADOOP-5131) ReduceCopier sleeps for a hardcoded interval of 5secs

2009-01-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5131: -- Attachment: HADOOP-5131_0_20090127.patch Preliminary patch while I continue testing

[jira] Commented: (HADOOP-5130) TaskTracker seems to hold onto the assigned task for a long while before launching it

2009-01-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667631#action_12667631 ] Arun C Murthy commented on HADOOP-5130: --- To clarify - the task did eventu

[jira] Updated: (HADOOP-5130) TaskTracker seems to hold onto the assigned task for a long while before launching it

2009-01-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5130: -- Priority: Blocker (was: Critical) Fix Version/s: 0.20.0 Promoting this to a blocker

[jira] Created: (HADOOP-5132) Allow MAX_XCEIVER_COUNT to be configured

2009-01-27 Thread Arun C Murthy (JIRA)
Versions: 0.20.0 Reporter: Arun C Murthy Fix For: 0.21.0 Currently DataXceiverServer.java hardcodes MAX_XCEIVER_COUNT to 256. I ran a randomwriter with 4k maps and output replication set to 40; all write pipelines failed since each DataNode hit the above limit and refused

[jira] Created: (HADOOP-5131) ReduceCopier sleeps for a hardcoded interval of 5secs

2009-01-27 Thread Arun C Murthy (JIRA)
: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Currently ReduceCopier.run has a hard-coded 5s sleep, which hurts jobs where latency is important; we really should have a mechanism where the thread fetching task-completion events

[jira] Created: (HADOOP-5130) TaskTracker seems to hold onto the assigned task for a long while before launching it

2009-01-27 Thread Arun C Murthy (JIRA)
Project: Hadoop Core Issue Type: Bug Components: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Priority: Critical I saw at least a couple of instances where the task assigned to the TaskTracker is launched several minutes after the receipt of

[jira] Updated: (HADOOP-5128) JobQueueTaskScheduler could assign multiple reduces per heartbeat

2009-01-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5128: -- Summary: JobQueueTaskScheduler could assign multiple reduces per heartbeat (was

[jira] Created: (HADOOP-5129) TaskTracker could send an out-of-band heartbeat when the last running map/reduce completes

2009-01-27 Thread Arun C Murthy (JIRA)
Project: Hadoop Core Issue Type: Improvement Components: mapred Affects Versions: 0.20.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Currently the TaskTracker strictly respects the heartbeat interval; this causes utilization issues when all

[jira] Updated: (HADOOP-5128) JobQueueTaskScheduler should Assign multiple reduces per heartbeat

2009-01-27 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-5128: -- Component/s: mapred Assignee: Arun C Murthy > JobQueueTaskScheduler should Ass

[jira] Created: (HADOOP-5128) JobQueueTaskScheduler should Assign multiple reduces per heartbeat

2009-01-27 Thread Arun C Murthy (JIRA)
: Improvement Affects Versions: 0.21.0 Reporter: Arun C Murthy Currently the JobQueueTaskScheduler assigns only 1 reduce per heartbeat; for applications where latency is important we could assign more reduces (up to the available slots).

[jira] Commented: (HADOOP-4667) Global scheduling in the Fair Scheduler

2009-01-23 Thread Arun C Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666446#action_12666446 ] Arun C Murthy commented on HADOOP-4667: --- bq. The JobInProgress class shoul
