[jira] [Commented] (MAPREDUCE-5614) job history file name should escape queue name

2013-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816981#comment-13816981
 ] 

Hadoop QA commented on MAPREDUCE-5614:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12612775/mr-5614.diff
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4185//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4185//console

This message is automatically generated.

> job history file name should escape queue name
> --
>
> Key: MAPREDUCE-5614
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5614
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Attachments: mr-5614.diff
>
>
> Our cluster's queue names contain hyphens, e.g. cug-taobao. Because the hyphen 
> is the delimiter in the job history file name, the JobHistoryServer shows 
> "cug" as the queue name. To fix this problem, we should escape the queue name 
> in the job history file name.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5614) job history file name should escape queue name

2013-11-07 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated MAPREDUCE-5614:
---

Attachment: mr-5614.diff

Attaching a patch that escapes the queue name.

> job history file name should escape queue name
> --
>
> Key: MAPREDUCE-5614
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5614
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Attachments: mr-5614.diff
>
>
> Our cluster's queue names contain hyphens, e.g. cug-taobao. Because the hyphen 
> is the delimiter in the job history file name, the JobHistoryServer shows 
> "cug" as the queue name. To fix this problem, we should escape the queue name 
> in the job history file name.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5614) job history file name should escape queue name

2013-11-07 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated MAPREDUCE-5614:
---

Status: Patch Available  (was: Open)

> job history file name should escape queue name
> --
>
> Key: MAPREDUCE-5614
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5614
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Attachments: mr-5614.diff
>
>
> Our cluster's queue names contain hyphens, e.g. cug-taobao. Because the hyphen 
> is the delimiter in the job history file name, the JobHistoryServer shows 
> "cug" as the queue name. To fix this problem, we should escape the queue name 
> in the job history file name.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (MAPREDUCE-5614) job history file name should escape queue name

2013-11-07 Thread Liyin Liang (JIRA)
Liyin Liang created MAPREDUCE-5614:
--

 Summary: job history file name should escape queue name
 Key: MAPREDUCE-5614
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5614
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Liyin Liang
Assignee: Liyin Liang


Our cluster's queue names contain hyphens, e.g. cug-taobao. Because the hyphen is 
the delimiter in the job history file name, the JobHistoryServer shows "cug" as 
the queue name. To fix this problem, we should escape the queue name in the job 
history file name.
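One way the fix could look: percent-escape the delimiter inside the queue name before composing the file name, and unescape it when parsing. The following is a minimal illustrative sketch only; the helper names (escapeQueue, unescapeQueue) and the exact escape scheme are assumptions, not the attached patch.

```java
// Illustrative sketch: percent-escape the history-file delimiter inside a
// field so that splitting on the delimiter cannot cut the queue name apart.
// DELIMITER, escapeQueue and unescapeQueue are hypothetical names.
public class QueueNameEscapeDemo {
    static final String DELIMITER = "-";
    static final String ESCAPED = "%2D"; // percent-encoding of '-'

    static String escapeQueue(String queue) {
        return queue.replace(DELIMITER, ESCAPED);
    }

    static String unescapeQueue(String escaped) {
        return escaped.replace(ESCAPED, DELIMITER);
    }

    public static void main(String[] args) {
        // Unescaped, splitting truncates "cug-taobao" to "cug".
        String broken = "job_1383_0001" + DELIMITER + "cug-taobao";
        System.out.println(broken.split(DELIMITER)[1]); // prints "cug"

        // Escaped, the full queue name survives the round trip.
        String fixed = "job_1383_0001" + DELIMITER + escapeQueue("cug-taobao");
        System.out.println(unescapeQueue(fixed.split(DELIMITER)[1]));
    }
}
```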



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816847#comment-13816847
 ] 

Hadoop QA commented on MAPREDUCE-5481:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12612714/MAPREDUCE-5481.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
  org.apache.hadoop.mapred.TestJobCleanup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4183//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4183//console

This message is automatically generated.

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5613) DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and lookups in an unused empty CHM

2013-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816839#comment-13816839
 ] 

Hadoop QA commented on MAPREDUCE-5613:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12612737/MAPREDUCE-5613.v01.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4184//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4184//console

This message is automatically generated.

> DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and 
> lookups in an unused empty CHM
> 
>
> Key: MAPREDUCE-5613
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5613
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Reporter: Gera Shegalov
> Attachments: MAPREDUCE-5613.v01.patch
>
>
> The only way pendingSpeculations is used:
> {code}
> // If the task is already known to be speculation-bait, don't do anything
> if (pendingSpeculations.get(task) != null) {
>   if (pendingSpeculations.get(task).get()) {
>     return;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5613) DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and lookups in an unused empty CHM

2013-11-07 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-5613:
-

Status: Patch Available  (was: Open)

> DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and 
> lookups in an unused empty CHM
> 
>
> Key: MAPREDUCE-5613
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5613
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Reporter: Gera Shegalov
> Attachments: MAPREDUCE-5613.v01.patch
>
>
> The only way pendingSpeculations is used:
> {code}
> // If the task is already known to be speculation-bait, don't do anything
> if (pendingSpeculations.get(task) != null) {
>   if (pendingSpeculations.get(task).get()) {
>     return;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5613) DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and lookups in an unused empty CHM

2013-11-07 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-5613:
-

Attachment: MAPREDUCE-5613.v01.patch

> DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and 
> lookups in an unused empty CHM
> 
>
> Key: MAPREDUCE-5613
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5613
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Reporter: Gera Shegalov
> Attachments: MAPREDUCE-5613.v01.patch
>
>
> The only way pendingSpeculations is used:
> {code}
> // If the task is already known to be speculation-bait, don't do anything
> if (pendingSpeculations.get(task) != null) {
>   if (pendingSpeculations.get(task).get()) {
>     return;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (MAPREDUCE-5613) DefaultSpeculator.statusUpdate wastes CPU for task attempt id hashing and lookups in an unused empty CHM

2013-11-07 Thread Gera Shegalov (JIRA)
Gera Shegalov created MAPREDUCE-5613:


 Summary: DefaultSpeculator.statusUpdate wastes CPU for task 
attempt id hashing and lookups in an unused empty CHM
 Key: MAPREDUCE-5613
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5613
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Gera Shegalov


The only way pendingSpeculations is used:
{code}
// If the task is already known to be speculation-bait, don't do anything
if (pendingSpeculations.get(task) != null) {
  if (pendingSpeculations.get(task).get()) {
    return;
  }
}
{code}
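Since the map is never populated, every status update still pays for hashing the task attempt id and probing the ConcurrentHashMap, twice. A hedged sketch of the waste and a cheaper shape (the names are illustrative; the actual patch may simply remove the unused map):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of the wasted work described above. The map stands in
// for DefaultSpeculator's never-written pendingSpeculations.
public class SpeculatorLookupDemo {
    static final ConcurrentHashMap<String, AtomicBoolean> pendingSpeculations =
            new ConcurrentHashMap<>();

    // Original shape: two get() calls per status update, each hashing the
    // task id, even though the map is always empty.
    static boolean originalCheck(String task) {
        if (pendingSpeculations.get(task) != null) {
            return pendingSpeculations.get(task).get();
        }
        return false;
    }

    // Cheaper shape: bail out while the map is empty, and look up only once.
    static boolean cheaperCheck(String task) {
        if (pendingSpeculations.isEmpty()) {
            return false;
        }
        AtomicBoolean bait = pendingSpeculations.get(task);
        return bait != null && bait.get();
    }

    public static void main(String[] args) {
        System.out.println(cheaperCheck("attempt_1383_0001_m_000001_0"));
    }
}
```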



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-4710) Add peak memory usage counter for each task

2013-11-07 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816725#comment-13816725
 ] 

Ming Ma commented on MAPREDUCE-4710:


This doesn't seem to be MR application specific; other YARN applications might 
want this as well. Should it be done at the NM level, so that general container 
peak memory usage data is available?

> Add peak memory usage counter for each task
> ---
>
> Key: MAPREDUCE-4710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4710
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: task
>Affects Versions: 1.0.2
>Reporter: Cindy Li
>Assignee: Cindy Li
>Priority: Minor
>  Labels: patch
> Attachments: MAPREDUCE-4710-trunk.patch, mapreduce-4710-v1.0.2.patch, 
> mapreduce-4710.patch, mapreduce4710.patch
>
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-07 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5481:
--

Attachment: MAPREDUCE-5481.patch

Attached a patch that enables Uber AMs with multiple reducers.

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-07 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5481:
--

Status: Patch Available  (was: Open)

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816213#comment-13816213
 ] 

viswanathan commented on MAPREDUCE-5508:


Thanks a lot Xi.

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816210#comment-13816210
 ] 

Xi Fang commented on MAPREDUCE-5508:


One way to confirm that is to set mapred.jobtracker.completeuserjobs.maximum = 0 
and run some jobs. After all the jobs are done, wait for a while and check the 
number of FileSystem objects in FileSystem#Cache.

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816205#comment-13816205
 ] 

viswanathan commented on MAPREDUCE-5508:


Hi Chris,

Thanks for your detailed response.

Thanks,

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816192#comment-13816192
 ] 

Chris Nauroth commented on MAPREDUCE-5508:
--

bq. Hope this fix is committed in branch-1, please share the revision of the 
commit.

http://svn.apache.org/viewvc?view=revision&revision=1497962
http://svn.apache.org/viewvc?view=revision&revision=1499904
http://svn.apache.org/viewvc?view=revision&revision=1525774

bq. I have also noticed that the JobTracker heap size is gradually increasing 
after the upgrade.

Just observing heap size wouldn't be sufficient to confirm or deny that the fix 
is in place.  It's natural for the JVM to grow the heap as needed.  Incremental 
garbage collection will clean that up gradually, and a full GC eventually would 
reclaim all unused space.  All of this is too unpredictable to confirm or deny 
the memory leak.

We confirmed this fix by running various MapReduce workloads in a controlled 
environment, running jmap on the JobTracker process to dump a memory map, and 
then viewing the dump with jhat.  When the memory leak happens, you end up 
seeing {{DistributedFileSystem}} instances that are only referenced from the 
internal {{HashMap}} of the {{FileSystem#Cache}}.  (With no other reference to 
the instance, it means that no one is ever going to close it, and therefore it 
will never get removed from the cache.)  With all of these patches applied, we 
see all {{DistributedFileSystem}} instances are referenced from the 
{{FileSystem#Cache}} and also some other references.

bq. Do I need to apply any other patches/fixes?

No, that's all of them.

If there are any additional questions, I recommend moving to the 
u...@hadoop.apache.org mailing list.
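The leak mechanism described here can be modeled with a toy cache: an instance referenced only from the cache map stays there until someone calls close() on it. This is a simplified sketch of the FileSystem#Cache behavior discussed above, not Hadoop's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the cache behavior described above: an entry is evicted only
// when close() is called on the cached instance, so an instance that nobody
// closes is pinned by the cache map forever. Not Hadoop's real code.
public class FsCacheLeakDemo {
    static final Map<String, FsCacheLeakDemo> CACHE = new HashMap<>();
    private final String key;

    private FsCacheLeakDemo(String key) {
        this.key = key;
    }

    static FsCacheLeakDemo get(String key) {
        return CACHE.computeIfAbsent(key, FsCacheLeakDemo::new);
    }

    void close() {
        CACHE.remove(key); // the only path that releases the cache entry
    }

    public static void main(String[] args) {
        FsCacheLeakDemo closed = get("hdfs://nn:8020 as user1");
        get("hdfs://nn:8020 as user2"); // never closed -> lingers in CACHE
        closed.close();
        System.out.println(CACHE.size());
    }
}
```

In the jmap/jhat dumps above, the tell-tale sign is exactly this shape: a DistributedFileSystem reachable only through the cache's internal HashMap.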

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5612) Document TaskAttemptCompletionEventStatuses

2013-11-07 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5612:
--

Labels: newbie  (was: )

> Document TaskAttemptCompletionEventStatuses
> ---
>
> Key: MAPREDUCE-5612
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5612
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Priority: Minor
>  Labels: newbie
>
> What's the difference between FAILED and TIPFAILED?  What is OBSOLETE?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (MAPREDUCE-5612) Document TaskAttemptCompletionStatuses

2013-11-07 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5612:
-

 Summary: Document TaskAttemptCompletionStatuses
 Key: MAPREDUCE-5612
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5612
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Priority: Minor


What's the difference between FAILED and TIPFAILED?  What is OBSOLETE?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5612) Document TaskAttemptCompletionEventStatuses

2013-11-07 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5612:
--

Summary: Document TaskAttemptCompletionEventStatuses  (was: Document 
TaskAttemptCompletionStatuses)

> Document TaskAttemptCompletionEventStatuses
> ---
>
> Key: MAPREDUCE-5612
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5612
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Priority: Minor
>
> What's the difference between FAILED and TIPFAILED?  What is OBSOLETE?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-4052) Windows eclpise can not submit the job

2013-11-07 Thread Pavel Ganelin (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816038#comment-13816038
 ] 

Pavel Ganelin commented on MAPREDUCE-4052:
--

This is still an issue with Hadoop 2.2.

The following ugly workaround can be used on the client side without modifying 
the Hadoop jar files:

{code}
{
    Field field = Shell.class.getDeclaredField("WINDOWS");
    field.setAccessible(true);
    Field modifiersField = Field.class.getDeclaredField("modifiers");
    modifiersField.setAccessible(true);
    modifiersField.setInt(field, field.getModifiers() & ~Modifier.FINAL);
    field.set(null, false);
}
{
    Field field = java.io.File.class.getDeclaredField("pathSeparator");
    field.setAccessible(true);
    Field modifiersField = Field.class.getDeclaredField("modifiers");
    modifiersField.setAccessible(true);
    modifiersField.setInt(field, field.getModifiers() & ~Modifier.FINAL);
    field.set(null, ":");
}
{code}



> Windows eclpise can not submit the job
> --
>
> Key: MAPREDUCE-4052
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4052
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 0.23.1
> Environment: client on the Windows, the the cluster on the suse
>Reporter: xieguiming
>Assignee: xieguiming
> Attachments: MAPREDUCE-4052-0.patch, MAPREDUCE-4052.patch
>
>
> When I use Eclipse on Windows to submit the job, the ApplicationMaster throws 
> the exception:
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/mapreduce/v2/app/MRAppMaster
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> Could not find the main class: 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.  Program will exit.
> The reason is that the Apps#addToEnvironment function uses
> private static final String SYSTEM_PATH_SEPARATOR =
>   System.getProperty("path.separator");
> which results in the MRAppMaster classpath using the ";" separator.
> I suggest that the NodeManager do the replacement.
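The separator mismatch the report describes can be sketched as a one-line normalization on the server side (illustrative only; normalizeForLinux is a hypothetical helper, and where such a replacement belongs in the NodeManager is exactly the open question above):

```java
// Illustrative sketch of the mismatch: a Windows client joins the classpath
// with its local path.separator (';'), but a Linux container needs ':'.
// normalizeForLinux is a hypothetical helper, not an actual Hadoop API.
public class PathSeparatorDemo {
    static String normalizeForLinux(String classpath) {
        return classpath.replace(';', ':');
    }

    public static void main(String[] args) {
        // What a Windows client would produce via System.getProperty("path.separator"):
        String fromWindowsClient = "hadoop-common.jar;hadoop-mapreduce-client-app.jar";
        System.out.println(normalizeForLinux(fromWindowsClient));
    }
}
```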



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread viswanathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

viswanathan updated MAPREDUCE-5508:
---

Attachment: JobInProgress.java
CleanupQueue.java

Hi Chris/Xi,

Please find attached the Java classes that were modified based on the patches 
for the JT OOME.

Did I miss anything in the Java classes? I modified them based on the patches.

In Hadoop 1.2.1, the FileSystem.java class is the same as in the patch, so I 
didn't modify it.

Please review and send your feedback.

Thanks,

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: CleanupQueue.java, JobInProgress.java, 
> MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, 
> MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-11-07 Thread viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815863#comment-13815863
 ] 

viswanathan commented on MAPREDUCE-5508:


Hi Chris/Xi,

Hope you are doing great.

As mentioned, we have upgraded our HDFS cluster with the following three patches 
for the JT OOME:

https://issues.apache.org/jira/secure/attachment/12590105/MAPREDUCE-5351-2.patch
https://issues.apache.org/jira/secure/attachment/12590672/MAPREDUCE-5351-addendum-1.patch
https://issues.apache.org/jira/secure/attachment/12604722/MAPREDUCE-5508.3.patch

I didn't apply the TestCleanupQueue changes from the above patches; I hope they 
are not required.

I have also noticed that the JobTracker heap size is gradually increasing after 
the upgrade.

The current heap size is as follows: Cluster Summary (Heap Size is 869.19 
MB/8.89 GB). Yesterday, after restarting the JT, the heap size was only a 
little over 200 MB.

Will the heap size reduce automatically after some threshold limit?

Do I need to apply any other patches/fixes?

Please advise, as this is a production environment.

Thanks,
Viswa.J

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Fix For: 1-win, 1.3.0
>
> Attachments: MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, 
> MAPREDUCE-5508.3.patch, MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code}
> // JobInProgress#cleanupJob()
> void cleanupJob() {
>   ...
>   tempDirFs = jobTempDirPath.getFileSystem(conf);
>   CleanupQueue.getInstance().addToQueue(
>       new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>   ...
>   if (tempDirFs != fs) {
>     try {
>       fs.close();
>     } catch (IOException ie) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)