[jira] Created: (MAPREDUCE-1233) Incorrect Waiting maps/reduces in Jobtracker metrics

2009-11-23 Thread V.Karthikeyan (JIRA)
Incorrect Waiting maps/reduces in Jobtracker metrics 
-

 Key: MAPREDUCE-1233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: V.Karthikeyan


Waiting maps/reduces are reported incorrectly in the JobTracker metrics when a 
job fails. When a map/reduce task fails (during job failure), the waiting 
maps/reduces count is incremented and is not decremented, even after job cleanup.
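
The imbalance described above can be illustrated with a minimal sketch. It is 
not the JobTracker code: WaitingTaskGauge, JobSketch and all of their methods 
are hypothetical names, and the sketch assumes that a failed task is put back 
into the waiting pool while cleanup of a failed job skips the matching decrement.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the waiting-maps gauge; not Hadoop's metrics class.
final class WaitingTaskGauge {
    private final AtomicInteger waitingMaps = new AtomicInteger();

    void addWaitingMaps(int n) { waitingMaps.addAndGet(n); }
    void decWaitingMaps(int n) { waitingMaps.addAndGet(-n); }
    int waitingMaps() { return waitingMaps.get(); }
}

// Hypothetical job that tracks how many of its maps the gauge counts as waiting.
final class JobSketch {
    private final WaitingTaskGauge gauge;
    private int pendingMaps;

    JobSketch(WaitingTaskGauge gauge, int maps) {
        this.gauge = gauge;
        this.pendingMaps = maps;
        gauge.addWaitingMaps(maps);
    }

    // A launched map is no longer waiting.
    void launchMap() { pendingMaps--; gauge.decWaitingMaps(1); }

    // Assumption: a failed map goes back to the waiting pool for a retry.
    void failMap() { pendingMaps++; gauge.addWaitingMaps(1); }

    // Unbalanced cleanup: drops local state but leaves the gauge untouched.
    void cleanupUnbalanced() { pendingMaps = 0; }

    // Balanced cleanup: removes whatever this job still contributes to the gauge.
    void cleanupBalanced() { gauge.decWaitingMaps(pendingMaps); pendingMaps = 0; }
}

final class WaitingTaskDemo {
    // Runs one failing job and returns the gauge value left behind after cleanup.
    static int leftoverAfterFailure(boolean balanced) {
        WaitingTaskGauge gauge = new WaitingTaskGauge();
        JobSketch job = new JobSketch(gauge, 2);
        job.launchMap();
        job.launchMap();
        job.failMap(); // the failure that also fails the job
        if (balanced) job.cleanupBalanced(); else job.cleanupUnbalanced();
        return gauge.waitingMaps();
    }

    public static void main(String[] args) {
        System.out.println("unbalanced cleanup: " + leftoverAfterFailure(false));
        System.out.println("balanced cleanup:   " + leftoverAfterFailure(true));
    }
}
```

In this sketch the unbalanced path leaves one waiting map behind after the job 
is gone, which matches the symptom in the report; the actual cause in the 
JobTracker may differ.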

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1140) Per cache-file refcount can become negative when tasks release distributed-cache files

2009-11-19 Thread V.Karthikeyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779954#action_12779954
 ] 

V.Karthikeyan commented on MAPREDUCE-1140:
--

Test Scenario: (Without the patch)

Configure local.cache.size to 4GB.

Ran Job1 with cache files file1 and file2 - Job succeeded.
Ran Job2 with cache files file3 and file1. While file3 was being localized, 
removed file3 from DFS - Job2 failed.
Ran Job3 with cache files file1, file1 (again) and file4. file4 is huge (say 
5GB), larger than local.cache.size.
Without the patch, the bug could be reproduced and was verified from the logs 
(file1, file2 and file3 were all deleted when the cache was full).

Test Scenario: (With the patch)

Configure local.cache.size to 4GB.

Ran Job1 with cache files file1 and file2 - Job succeeded.
Ran Job2 with cache files file3 and file1. While file3 was being localized, 
removed file3 from DFS - Job2 failed.
Ran Job3 with cache files file1, file1 (again) and file4. file4 is huge (say 
5GB), larger than local.cache.size.
Job3 succeeded, which was also validated from the logs (file2 and file3 were 
deleted, while file1 was never deleted even though the cache did not have 
enough room to localize file4).
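
The behaviour checked in the patched scenario can be sketched as a 
reference-counted cache that only evicts entries nobody is using. This is an 
illustration under assumptions, not the TaskTracker implementation: 
RefCountedCacheSketch and its methods are hypothetical, sizes are in arbitrary 
units, and release is clamped at zero so that a count can never go negative.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reference-counted cache; not the Hadoop distributed-cache code.
final class RefCountedCacheSketch {
    private static final class Entry {
        final long size;
        int refCount;
        Entry(long size) { this.size = size; }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();
    private final long capacity;
    private long used;

    RefCountedCacheSketch(long capacity) { this.capacity = capacity; }

    // One call per reference a job takes on a localized file.
    void acquire(String file, long size) {
        Entry e = entries.get(file);
        if (e == null) {
            e = new Entry(size);
            entries.put(file, e);
            used += size;
        }
        e.refCount++;
    }

    // Clamped at zero: an extra release must not drive the count negative.
    void release(String file) {
        Entry e = entries.get(file);
        if (e != null && e.refCount > 0) e.refCount--;
    }

    // Evicts unreferenced entries until 'needed' units fit or no candidate is left.
    void makeRoom(long needed) {
        Iterator<Entry> it = entries.values().iterator();
        while (used + needed > capacity && it.hasNext()) {
            Entry e = it.next();
            if (e.refCount == 0) {
                used -= e.size;
                it.remove();
            }
        }
    }

    boolean contains(String file) { return entries.containsKey(file); }

    // Replays the three jobs from the scenario with made-up sizes (1 unit per file).
    static RefCountedCacheSketch replayScenario() {
        RefCountedCacheSketch cache = new RefCountedCacheSketch(4);
        cache.acquire("file1", 1); cache.acquire("file2", 1); // Job1
        cache.release("file1");    cache.release("file2");
        cache.acquire("file3", 1); cache.acquire("file1", 1); // Job2 (fails)
        cache.release("file3");    cache.release("file1");
        cache.acquire("file1", 1); cache.acquire("file1", 1); // Job3, file1 twice
        cache.makeRoom(5);                                    // file4 exceeds capacity
        return cache;
    }

    public static void main(String[] args) {
        RefCountedCacheSketch cache = replayScenario();
        System.out.println("file1 kept: " + cache.contains("file1"));
        System.out.println("file2 kept: " + cache.contains("file2"));
        System.out.println("file3 kept: " + cache.contains("file3"));
    }
}
```

With correct counts, file1 stays in the cache because Job3 still references it, 
while the unreferenced file2 and file3 are evicted. A count that had gone 
negative or been left stale would break exactly this distinction.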

> Per cache-file refcount can become negative when tasks release 
> distributed-cache files
> --
>
> Key: MAPREDUCE-1140
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1140
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.20.2, 0.21.0, 0.22.0
>Reporter: Vinod K V
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-1140-1.txt, patch-1140-2.txt, 
> patch-1140-ydist.txt, patch-1140-ydist.txt, patch-1140.txt
>
>





[jira] Commented: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker

2009-11-19 Thread V.Karthikeyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779922#action_12779922
 ] 

V.Karthikeyan commented on MAPREDUCE-1219:
--

Verified the job metrics with the FileContext property enabled. 
Ran jobs to verify that the counters are in sync with the JobTracker UI and 
with the log file generated with FileContext enabled.


> JobTracker Metrics causes undue load on JobTracker
> --
>
> Key: MAPREDUCE-1219
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Jothi Padmanabhan
>Assignee: Amareshwari Sriramadasu
> Attachments: MAPREDUCE-1219.patch, patch-1219-ydist.txt
>
>
> JobTrackerMetricsInst.doUpdates updates job-level counters of all running 
> jobs into JobTracker's metrics, which causes very bad performance and hampers 
> heartbeats. Since job-level metrics are better served by JobHistory, it may 
> be a good idea to remove these from the metrics framework.
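
The cost pattern described in the issue can be sketched roughly as follows. 
This is not the JobTrackerMetricsInst code: MetricsUpdateSketch, its methods, 
and the "jobs_running" gauge name are hypothetical, and the sketch only assumes 
that a periodic update which copies every counter of every running job does 
work proportional to jobs times counters, while a cluster-level update does a 
fixed amount of work.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a periodic metrics update; not Hadoop's metrics code.
final class MetricsUpdateSketch {

    // Builds made-up counter tables for the given number of running jobs.
    static List<Map<String, Long>> runningJobs(int jobs, int countersPerJob) {
        List<Map<String, Long>> result = new ArrayList<>();
        for (int j = 0; j < jobs; j++) {
            Map<String, Long> counters = new HashMap<>();
            for (int c = 0; c < countersPerJob; c++) {
                counters.put("counter" + c, (long) c);
            }
            result.add(counters);
        }
        return result;
    }

    // Per-job publishing: each update touches every counter of every running job.
    static long publishPerJob(List<Map<String, Long>> jobs, Map<String, Long> sink) {
        long writes = 0;
        for (int j = 0; j < jobs.size(); j++) {
            for (Map.Entry<String, Long> counter : jobs.get(j).entrySet()) {
                sink.put("job" + j + "." + counter.getKey(), counter.getValue());
                writes++;
            }
        }
        return writes;
    }

    // Cluster-level publishing: a fixed number of gauges, independent of job count.
    static long publishClusterOnly(int runningJobCount, Map<String, Long> sink) {
        sink.put("jobs_running", (long) runningJobCount);
        return 1;
    }

    public static void main(String[] args) {
        List<Map<String, Long>> jobs = runningJobs(100, 50);
        System.out.println("per-job writes: " + publishPerJob(jobs, new HashMap<>()));
        System.out.println("cluster writes: " + publishClusterOnly(jobs.size(), new HashMap<>()));
    }
}
```

If such an update runs while holding a lock that heartbeat handling also needs, 
the per-job variant delays heartbeats more as the number of running jobs grows. 
Whether that is the exact contention in the JobTracker is an assumption here.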
