[jira] [Commented] (MAPREDUCE-2863) Support web-services for RM & NM

2011-11-12 Thread Thomas Graves (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149178#comment-13149178
 ] 

Thomas Graves commented on MAPREDUCE-2863:
--

Hey Hitesh, 

Thanks for the feedback. We could easily change the JSON to match or be closer. 
Right now it's configured for the POJO output format.  We have a few options: 
http://jersey.java.net/nonav/apidocs/latest/jersey/com/sun/jersey/api/json/JSONConfiguration.Notation.html

Any input on which one people prefer?  The default is mapped when I turn off 
POJO, and the output looks like:

http://virt09-pv1.tgraves.pool.corp.sp2.yahoo.com:8088/ws/v1/cluster/apps
{
   "app" : {
      "finalStatus" : "UNDEFINED",
      "finishedTime" : "0",
      "progress" : "0.0",
      "name" : "word count",
      "startedTime" : "1321112670525",
      "amContainerLogs" : "http://host:/node/containerlogs/container_1321112633248_0001_01_01",
      "elapsedTime" : "8681",
      "note" : "",
      "trackingUI" : "UNASSIGNED",
      "state" : "ACCEPTED",
      "appId" : "application_1321112633248_0001",
      "trackingUrl" : "UNASSIGNED",
      "user" : "tgraves",
      "queue" : "default",
      "clusterId" : "1321112633248"
   }
}
 


  
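<app>
  <appId>application_1321112633248_0001</appId>
  <user>tgraves</user>
  <name>word count</name>
  <queue>default</queue>
  <state>ACCEPTED</state>
  <progress>0.0</progress>
  <trackingUI>UNASSIGNED</trackingUI>
  <trackingUrl>UNASSIGNED</trackingUrl>
  <finalStatus>UNDEFINED</finalStatus>
  <startedTime>1321112670525</startedTime>
  <finishedTime>0</finishedTime>
  <elapsedTime>8717</elapsedTime>
  <note></note>
  <amContainerLogs>http://host:/node/containerlogs/container_1321112633248_0001_01_01</amContainerLogs>
  <clusterId>1321112633248</clusterId>
</app>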
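
One thing worth noting about the mapped JSON output is that every value, including numeric ones such as progress and startedTime, arrives as a JSON string. Here is a minimal sketch of how a client might turn those strings back into typed values once it has pulled them out of the response. The AppFields class and its method names are hypothetical and not part of the patch, and treating elapsedTime as milliseconds is an assumption based on the size of the sample values.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical client-side helper; not part of the MAPREDUCE-2863 patch.
class AppFields {
    // startedTime in the sample is a 13-digit value, i.e. epoch milliseconds.
    static Instant startedAt(String startedTime) {
        return Instant.ofEpochMilli(Long.parseLong(startedTime));
    }

    // Assumption: elapsedTime is reported in milliseconds as well.
    static Duration elapsed(String elapsedTime) {
        return Duration.ofMillis(Long.parseLong(elapsedTime));
    }

    // progress is a decimal number serialized as a string.
    static float progress(String progress) {
        return Float.parseFloat(progress);
    }

    public static void main(String[] args) {
        // Values copied from the sample output in this comment.
        System.out.println("started:  " + startedAt("1321112670525"));
        System.out.println("elapsed:  " + elapsed("8681").toMillis() + " ms");
        System.out.println("progress: " + progress("0.0"));
    }
}
```

A notation that emits numbers as JSON numbers would make this conversion step unnecessary on the client side, which may be one factor in choosing between the options.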
  



> Support web-services for RM & NM
> 
>
> Key: MAPREDUCE-2863
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2863
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, nodemanager, resourcemanager
>Reporter: Arun C Murthy
>Assignee: Thomas Graves
> Attachments: MAPREDUCE-2863.patch, nmoutput.txt, rmoutput.txt
>
>
> It will be very useful for RM and NM to support web-services to export 
> json/xml.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (MAPREDUCE-3395) Add mapred.disk.healthChecker.interval to mapred-default.xml

2011-11-12 Thread Harsh J (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved MAPREDUCE-3395.


Resolution: Fixed

> Add mapred.disk.healthChecker.interval to mapred-default.xml
> 
>
> Key: MAPREDUCE-3395
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3395
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.205.0
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Trivial
> Fix For: 0.20.206.0
>
> Attachments: mapreduce-3395-1.patch, mapreduce-3395-2.patch
>
>
> Let's add mapred.disk.healthChecker.interval to mapred-default.xml.





[jira] [Updated] (MAPREDUCE-3395) Add mapred.disk.healthChecker.interval to mapred-default.xml

2011-11-12 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-3395:
---

Fix Version/s: 0.20.206.0

Committing to 0.20.206. Thanks Eli!

> Add mapred.disk.healthChecker.interval to mapred-default.xml
> 
>
> Key: MAPREDUCE-3395
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3395
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.205.0
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Trivial
> Fix For: 0.20.206.0
>
> Attachments: mapreduce-3395-1.patch, mapreduce-3395-2.patch
>
>
> Let's add mapred.disk.healthChecker.interval to mapred-default.xml.





[jira] [Updated] (MAPREDUCE-3395) Add mapred.disk.healthChecker.interval to mapred-default.xml

2011-11-12 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-3395:
---

Attachment: mapreduce-3395-2.patch

Nit-corrected patch. Committing.

> Add mapred.disk.healthChecker.interval to mapred-default.xml
> 
>
> Key: MAPREDUCE-3395
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3395
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.205.0
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Trivial
> Attachments: mapreduce-3395-1.patch, mapreduce-3395-2.patch
>
>
> Let's add mapred.disk.healthChecker.interval to mapred-default.xml.





[jira] [Updated] (MAPREDUCE-3395) Add mapred.disk.healthChecker.interval to mapred-default.xml

2011-11-12 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-3395:
---

Hadoop Flags: Reviewed

> Add mapred.disk.healthChecker.interval to mapred-default.xml
> 
>
> Key: MAPREDUCE-3395
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3395
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.205.0
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Trivial
> Attachments: mapreduce-3395-1.patch, mapreduce-3395-2.patch
>
>
> Let's add mapred.disk.healthChecker.interval to mapred-default.xml.





[jira] [Updated] (MAPREDUCE-3393) TestMRJobs, TestMROldApiJobs, and TestUberAM failures

2011-11-12 Thread Thomas Graves (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-3393:
-

Attachment: org.apache.hadoop.mapreduce.v2.TestMRJobs-output.txt

> TestMRJobs, TestMROldApiJobs, and TestUberAM failures
> -
>
> Key: MAPREDUCE-3393
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3393
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Hitesh Shah
> Attachments: MR-3393.1.patch, MR-3393.2.patch, 
> org.apache.hadoop.mapreduce.v2.TestMRJobs-output.txt
>
>
> Check out branch 0.23 and run mvn test from hadoop-mapreduce-project directory
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.mapred.TestClientServiceDelegate
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.717 sec
> Running org.apache.hadoop.mapred.TestClientRedirect
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.436 sec
> Running org.apache.hadoop.mapreduce.TestYarnClientProtocolProvider
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.975 sec
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 4, Failures: 3, Errors: 1, Skipped: 0, Time elapsed: 67.999 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestYARNRunner
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.976 sec
> Running org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
> Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 31.879 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
> Running org.apache.hadoop.mapreduce.v2.TestUberAM
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 101.096 sec 
> <<< FAILURE!
> Results :
> Failed tests:   testSleepJob(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testRandomWriter(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testDistributedCache(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testJobSucceed(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs): Job 
> expected to succeed failed
>   testJobFail(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs)
> Tests in error: 
>   testFailingMapper(org.apache.hadoop.mapreduce.v2.TestMRJobs): 0
>   org.apache.hadoop.mapreduce.v2.TestUberAM: Failed to Start 
> org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 19, Failures: 5, Errors: 2, Skipped: 0





[jira] [Commented] (MAPREDUCE-3393) TestMRJobs, TestMROldApiJobs, and TestUberAM failures

2011-11-12 Thread Thomas Graves (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149075#comment-13149075
 ] 

Thomas Graves commented on MAPREDUCE-3393:
--

TestMROldApiJobs and TestUberAM both fail with the exception below, so perhaps 
something isn't being shut down cleanly in a test that runs before them, or in 
the failure of TestMRJobs.  If you can't reproduce it, let me know and I'll look 
at the failures.

2011-11-11 17:20:44,452 ERROR [Thread-4] service.CompositeService (CompositeService.java:start(72)) - Error starting services ResourceManager
org.apache.hadoop.yarn.YarnException: java.net.BindException: Problem binding to [0.0.0.0:8025] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:125)
    at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:63)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.start(ResourceTrackerService.java:125)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.start(ResourceManager.java:439)
    at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper$2.run(MiniYARNCluster.java:126)
Caused by: java.net.BindException: Problem binding to [0.0.0.0:8025] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:606)
    at org.apache.hadoop.ipc.Server.bind(Server.java:230)
    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:310)
    at org.apache.hadoop.ipc.Server.<init>(Server.java:1591)
    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:576)
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.<init>(ProtoOverHadoopRpcEngine.java:314)
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine.getServer(ProtoOverHadoopRpcEngine.java:390)
    at org.apache.hadoop.ipc.RPC.getServer(RPC.java:550)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:155)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:118)
    ... 5 more
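
For anyone trying to reproduce this outside the test suite, the "Address already in use" condition can be illustrated with plain java.net sockets. This is only a sketch of the general mechanism, not the Hadoop RPC server or the MiniYARNCluster code: a second listener on a port fails while an earlier listener still holds it, and succeeds once the earlier one has been closed. The PortReuseDemo class and its methods are hypothetical names, and the sketch uses an ephemeral loopback port rather than 8025.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; not taken from the Hadoop code base.
class PortReuseDemo {
    // Returns true if a new listener can be bound to the loopback port right now.
    static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket()) {
            probe.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return true;
        } catch (IOException e) {
            // java.net.BindException (a subclass of IOException) lands here.
            return false;
        }
    }

    // Returns { bindable while the first listener is open, bindable after it is closed }.
    static boolean[] demo() {
        try {
            // Port 0 asks the OS for any free port.
            ServerSocket first = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
            int port = first.getLocalPort();
            boolean whileOpen = canBind(port);
            first.close(); // the clean shutdown that releases the port
            boolean afterClose = canBind(port);
            return new boolean[] { whileOpen, afterClose };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("bindable while first listener is open: " + result[0]);
        System.out.println("bindable after first listener is closed: " + result[1]);
    }
}
```

If an earlier test leaves its ResourceManager running, the next cluster start would hit the first case, which would be consistent with the tests passing individually but failing when run together.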

> TestMRJobs, TestMROldApiJobs, and TestUberAM failures
> -
>
> Key: MAPREDUCE-3393
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3393
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Hitesh Shah
> Attachments: MR-3393.1.patch, MR-3393.2.patch
>
>
> Check out branch 0.23 and run mvn test from hadoop-mapreduce-project directory
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.mapred.TestClientServiceDelegate
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.717 sec
> Running org.apache.hadoop.mapred.TestClientRedirect
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.436 sec
> Running org.apache.hadoop.mapreduce.TestYarnClientProtocolProvider
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.975 sec
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 4, Failures: 3, Errors: 1, Skipped: 0, Time elapsed: 67.999 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestYARNRunner
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.976 sec
> Running org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
> Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 31.879 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
> Running org.apache.hadoop.mapreduce.v2.TestUberAM
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 101.096 sec 
> <<< FAILURE!
> Results :
> Failed tests:   testSleepJob(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testRandomWriter(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testDistributedCache(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testJobSucceed(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs): Job 
> expected to succeed failed
>   testJobFail(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs)
> Tests in error: 
>   testFailingMapper(org.apache.hadoop.mapreduce.v2.TestMRJobs): 0
>   org.apache.hadoop.mapreduce.v2.TestUberAM: Failed to Start 
> org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 19, Failures: 5, Errors: 2, Skipped: 0


[jira] [Commented] (MAPREDUCE-3393) TestMRJobs, TestMROldApiJobs, and TestUberAM failures

2011-11-12 Thread Thomas Graves (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149072#comment-13149072
 ] 

Thomas Graves commented on MAPREDUCE-3393:
--

Yes, JAVA_HOME is set.  Did you run them individually or all the tests?  Sorry, I 
should have said this originally: they only fail when I run them all together. I 
could not get them to fail when run individually. I'll attach logs.

> TestMRJobs, TestMROldApiJobs, and TestUberAM failures
> -
>
> Key: MAPREDUCE-3393
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3393
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Hitesh Shah
> Attachments: MR-3393.1.patch, MR-3393.2.patch
>
>
> Check out branch 0.23 and run mvn test from hadoop-mapreduce-project directory
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.mapred.TestClientServiceDelegate
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.717 sec
> Running org.apache.hadoop.mapred.TestClientRedirect
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.436 sec
> Running org.apache.hadoop.mapreduce.TestYarnClientProtocolProvider
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.975 sec
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 4, Failures: 3, Errors: 1, Skipped: 0, Time elapsed: 67.999 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestYARNRunner
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.976 sec
> Running org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
> Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 31.879 sec 
> <<< FAILURE!
> Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
> Running org.apache.hadoop.mapreduce.v2.TestUberAM
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 101.096 sec 
> <<< FAILURE!
> Results :
> Failed tests:   testSleepJob(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testRandomWriter(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testDistributedCache(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   testJobSucceed(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs): Job 
> expected to succeed failed
>   testJobFail(org.apache.hadoop.mapreduce.v2.TestMROldApiJobs)
> Tests in error: 
>   testFailingMapper(org.apache.hadoop.mapreduce.v2.TestMRJobs): 0
>   org.apache.hadoop.mapreduce.v2.TestUberAM: Failed to Start 
> org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 19, Failures: 5, Errors: 2, Skipped: 0





[jira] [Commented] (MAPREDUCE-3343) TaskTracker Out of Memory because of distributed cache

2011-11-12 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149009#comment-13149009
 ] 

Hadoop QA commented on MAPREDUCE-3343:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12503477/MAPREDUCE-3343_rev2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1297//console

This message is automatically generated.

> TaskTracker Out of Memory because of distributed cache
> --
>
> Key: MAPREDUCE-3343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3343
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 0.20.205.0
>Reporter: Ahmed Radwan
>Assignee: zhaoyunjiong
>  Labels: mapreduce, patch
> Attachments: MAPREDUCE-3343_rev2.patch, 
> mapreduce-3343-release-0.20.205.0.patch
>
>
> This out-of-memory error happens when you run a large number of jobs (using 
> the distributed cache) on a TaskTracker. 
> The basic issue seems to be with the distributedCacheManager (an instance of 
> TrackerDistributedCacheManager in TaskTracker.java). It gets created during 
> TaskTracker.initialize(), and it keeps references to 
> TaskDistributedCacheManager for every submitted job via the jobArchives map, 
> and also references to CacheStatus via the cachedArchives map. I am not seeing 
> these cleaned up between jobs, so this can cause out-of-memory problems after 
> a really large number of jobs are submitted. We have seen this issue in a 
> number of cases.
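
The leak pattern described in the issue can be sketched in isolation: a long-lived map gains one entry per submitted job, and unless something removes the entry when the job finishes, the map grows for the life of the process. The following is a minimal illustration only, not the attached patch and not the real TaskTracker code. JobArchiveRegistry and its methods are hypothetical names, and jobArchives here is a plain map standing in for the field mentioned in the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for per-job bookkeeping; not the actual patch.
class JobArchiveRegistry {
    // One entry per submitted job, keyed by job id.
    private final Map<String, Object> jobArchives = new ConcurrentHashMap<>();

    void jobStarted(String jobId, Object cacheManager) {
        jobArchives.put(jobId, cacheManager);
    }

    // Dropping the entry on completion lets the per-job state be garbage collected.
    void jobFinished(String jobId) {
        jobArchives.remove(jobId);
    }

    int trackedJobs() {
        return jobArchives.size();
    }

    public static void main(String[] args) {
        JobArchiveRegistry registry = new JobArchiveRegistry();
        for (int i = 0; i < 1000; i++) {
            String jobId = "job_" + i;
            registry.jobStarted(jobId, new byte[1024]);
            registry.jobFinished(jobId); // without this call the map keeps growing
        }
        System.out.println("entries still tracked: " + registry.trackedJobs());
    }
}
```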





[jira] [Updated] (MAPREDUCE-3343) TaskTracker Out of Memory because of distributed cache

2011-11-12 Thread Ahmed Radwan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-3343:


Attachment: MAPREDUCE-3343_rev2.patch

Here is zhaoyunjiong's patch incorporating Eli's additional comments.

> TaskTracker Out of Memory because of distributed cache
> --
>
> Key: MAPREDUCE-3343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3343
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 0.20.205.0
>Reporter: Ahmed Radwan
>Assignee: zhaoyunjiong
>  Labels: mapreduce, patch
> Attachments: MAPREDUCE-3343_rev2.patch, 
> mapreduce-3343-release-0.20.205.0.patch
>
>
> This out-of-memory error happens when you run a large number of jobs (using 
> the distributed cache) on a TaskTracker. 
> The basic issue seems to be with the distributedCacheManager (an instance of 
> TrackerDistributedCacheManager in TaskTracker.java). It gets created during 
> TaskTracker.initialize(), and it keeps references to 
> TaskDistributedCacheManager for every submitted job via the jobArchives map, 
> and also references to CacheStatus via the cachedArchives map. I am not seeing 
> these cleaned up between jobs, so this can cause out-of-memory problems after 
> a really large number of jobs are submitted. We have seen this issue in a 
> number of cases.
