[jira] [Commented] (MAPREDUCE-5014) Extending DistCp through a custom CopyListing is not possible

2013-03-17 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604861#comment-13604861
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-5014:


The CopyListing class has a static factory method, getCopyListing(). We should 
make the factory method configuration-based, so that users can plug in a custom 
CopyListing without extending the Tool itself.
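
A minimal sketch of what a configuration-based factory could look like. This is not the DistCp code: java.util.Properties stands in for Hadoop's Configuration, and the class names (ListingSketch, SimpleListingSketch, CustomListingSketch, ListingFactorySketch) and the key "distcp.copy.listing.class" are assumptions made for this illustration only.

```java
import java.util.Properties;

// Stand-in for DistCp's CopyListing; the real class has a richer contract.
abstract class ListingSketch {
    abstract String name();
}

// Plays the role of the default listing used when nothing is configured.
class SimpleListingSketch extends ListingSketch {
    String name() { return "simple"; }
}

// Plays the role of a user-supplied listing.
class CustomListingSketch extends ListingSketch {
    String name() { return "custom"; }
}

class ListingFactorySketch {
    // Hypothetical configuration key, chosen for this illustration only.
    static final String LISTING_CLASS_KEY = "distcp.copy.listing.class";

    // Looks up the class name in the configuration and instantiates it by
    // reflection, falling back to the default listing when the key is unset.
    static ListingSketch getCopyListing(Properties conf) {
        String className =
            conf.getProperty(LISTING_CLASS_KEY, SimpleListingSketch.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(ListingSketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

With this shape, a user would only set the key to the name of their own class; the driver itself would not need to be subclassed.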



> Extending DistCp through a custom CopyListing is not possible
> -
>
> Key: MAPREDUCE-5014
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5014
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 0.23.0, 0.23.1, 0.23.3, trunk, 0.23.4, 0.23.5
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
> Attachments: MAPREDUCE-5014.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> * While it is possible to implement a custom CopyListing in DistCp, the DistCp 
> driver class doesn't allow this custom CopyListing to be used.
> * Allow SimpleCopyListing to provide an option to exclude files. (For instance, 
> it is useful to exclude FileOutputCommitter.SUCCEEDED_FILE_NAME during a copy, 
> as copying it prematurely can signal that the entire data set is already 
> available at the destination.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.branch-1-win.2.patch

Minor patch update, factoring common unittest code into utility methods.

> JobTracker should set a timeout when calling into job.end.notification.url
> --
>
> Key: MAPREDUCE-5066
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-5066.branch-1-win.2.patch, 
> MAPREDUCE-5066.branch-1-win.patch
>
>
> In current code, timeout is not specified when JobTracker (JobEndNotifier) 
> calls into the notification URL. When the given URL points to a server that 
> will not respond for a long time, job notifications are completely stuck 
> (given that we have only a single thread processing all notifications). We've 
> seen this cause noticeable delays in job execution in components that rely on 
> job end notifications (like Oozie workflows). 
> I propose we introduce a configurable timeout option and set a default to a 
> reasonably small value.
> If we want, we can also introduce a configurable number of workers processing 
> the notification queue (not sure if this is needed though at this point).
> I will prepare a patch soon. Please comment back.
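
The proposal above can be sketched with the JDK's HttpURLConnection, which accepts both a connect timeout and a read timeout. This is an illustration under assumptions, not the attached patch: NotifierSketch, its method names, and the 5000 ms default are hypothetical, and the real JobEndNotifier may use a different HTTP client.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class NotifierSketch {
    // Hypothetical default; the issue only asks for "a reasonably small value".
    static final int DEFAULT_TIMEOUT_MS = 5000;

    // Prepares a connection with both timeouts set. No network traffic
    // happens here; the connection is opened lazily on first use.
    static HttpURLConnection open(String url, int timeoutMs) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMs); // bound the TCP connect
            conn.setReadTimeout(timeoutMs);    // bound the wait for a response
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sends one notification; a timeout surfaces as an IOException
    // (SocketTimeoutException) and is reported as a failed attempt.
    static boolean notifyOnce(String url, int timeoutMs) {
        HttpURLConnection conn = open(url, timeoutMs);
        try {
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException e) {
            return false;
        } finally {
            conn.disconnect();
        }
    }
}
```

With both timeouts bounded, an unresponsive server costs the single notification thread at most a fixed amount of time per attempt instead of blocking it indefinitely.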



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Status: Open  (was: Patch Available)



[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.

2013-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604815#comment-13604815
 ] 

Hadoop QA commented on MAPREDUCE-5065:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12574096/MAPREDUCE-5065.branch-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 one of tests included doesn't have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-distcp.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3424//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3424//console

This message is automatically generated.

> DistCp should skip checksum comparisons if block-sizes are different on 
> source/target.
> --
>
> Key: MAPREDUCE-5065
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5065
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.0.3-alpha, 0.23.5
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: MAPREDUCE-5065.branch-0.23.patch, 
> MAPREDUCE-5065.branch-2.patch
>
>
> When copying files between 2 clusters with different default block-sizes, one 
> sees that the copy fails with a checksum-mismatch, even though the files have 
> identical contents.
> The reason is that on HDFS, a file's checksum is unfortunately a function of 
> the block-size of the file. So you could have 2 different files with 
> identical contents (but different block-sizes) have different checksums. 
> (Thus, it's also possible for DistCp to fail to copy files on the same 
> file-system, if the source-file's block-size differs from HDFS default, and 
> -pb isn't used.)
> I propose that we skip checksum comparisons under the following conditions:
> 1. -skipCrc is specified.
> 2. File-size is 0 (in which case the call to the checksum-servlet is moot).
> 3. source.getBlockSize() != target.getBlockSize(), since the checksums are 
> guaranteed to differ in this case.
> I have a patch for #3.
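
The three conditions proposed above can be written as a single predicate. This is a sketch of the proposal as described, not of the committed code (an update on this issue reports that the patch was later changed to keep the CRC check and suggest -pb or -skipCrc instead). ChecksumPolicySketch and its parameter names are hypothetical.

```java
class ChecksumPolicySketch {
    // Returns true when the checksum comparison would be skipped under the
    // proposal: -skipCrc was given, the file is empty, or the block sizes
    // differ (in which case the HDFS checksums cannot match anyway).
    static boolean shouldSkipChecksum(boolean skipCrc,
                                      long fileLength,
                                      long sourceBlockSize,
                                      long targetBlockSize) {
        return skipCrc
            || fileLength == 0
            || sourceBlockSize != targetBlockSize;
    }
}
```

For example, a 1 KB file copied from a cluster with 64 MB blocks to one with 128 MB blocks would skip the comparison, while the same file copied between clusters with equal block sizes would still be checked.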



[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.

2013-03-17 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated MAPREDUCE-5065:


Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.

2013-03-17 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated MAPREDUCE-5065:


Attachment: MAPREDUCE-5065.branch-2.patch
MAPREDUCE-5065.branch-0.23.patch

Modified the patch so that it no longer skips CRC checks; instead, when the 
checksums differ, it suggests using either -pb or -skipCrc.

Updated tests for the same.



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604766#comment-13604766
 ] 

Hadoop QA commented on MAPREDUCE-5066:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12574085/MAPREDUCE-5066.branch-1-win.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3423//console

This message is automatically generated.



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Status: Patch Available  (was: Open)



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604764#comment-13604764
 ] 

Ivan Mitic commented on MAPREDUCE-5066:
---

bq. Job notification also exists in 2.x which may face the same set of issues.
Thanks Hitesh, it should be straightforward to rebase the patch for the 2.x 
branch. Will do so once the current patch is reviewed.



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.branch-1-win.patch

Attaching the branch-1 compatible patch. 

A few notes:
 - Introduced missing unit tests for the JobEndNotifier that cover most of its 
functionality
 - Added a test case that targets the problem from the Jira
 - Fixed a bug in how the retry count is computed (we had an extra retry attempt 
previously)



[jira] [Commented] (MAPREDUCE-3916) various issues with running yarn proxyserver

2013-03-17 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604617#comment-13604617
 ] 

Suresh Srinivas commented on MAPREDUCE-3916:


Maybe svn add was not done before creating the patch? Since Jenkins did not 
have the right patch, hopefully this did not cause any issues. 

> various issues with running yarn proxyserver
> 
>
> Key: MAPREDUCE-3916
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3916
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager, webapps
>Affects Versions: 0.23.1, 2.0.0-alpha, 3.0.0
>Reporter: Roman Shaposhnik
>Assignee: Devaraj K
>Priority: Critical
>  Labels: mrv2
> Fix For: 2.0.0-alpha
>
> Attachments: MAPREDUCE-3916.patch
>
>
> Seems like yarn proxyserver is not operational when running out of the 0.23.1 
> RC2 tarball.
> # Setting yarn.web-proxy.address to match yarn.resourcemanager.address 
> doesn't disable the proxyserver (although not setting yarn.web-proxy.address 
> at all correctly disables it and produces a message: 
> org.apache.hadoop.yarn.YarnException: yarn.web-proxy.address is not set so 
> the proxy will not run). This contradicts the documentation provided for 
> yarn.web-proxy.address in yarn-default.xml
> # Setting yarn.web-proxy.address and running the service results in the 
> following:
> {noformat}
> $ ./sbin/yarn-daemon.sh start proxyserver 
> starting proxyserver, logging to 
> /tmp/hadoop-0.23.1/logs/yarn-rvs-proxyserver-ahmed-laptop.out
> /usr/java/64/jdk1.6.0_22/bin/java -Dproc_proxyserver -Xmx1000m 
> -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs 
> -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs 
> -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
> -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.home.dir= 
> -Dyarn.id.str=rvs -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA 
> -Djava.library.path=/tmp/hadoop-0.23.1/lib/native 
> -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs 
> -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs 
> -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
> -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
> -Dyarn.home.dir=/tmp/hadoop-0.23.1 -Dhadoop.root.logger=INFO,DRFA 
> -Dyarn.root.logger=INFO,DRFA 
> -Djava.library.path=/tmp/hadoop-0.23.1/lib/native -classpath 
> /tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/share/hadoop/common/lib/*:/tmp/hadoop-0.23.1/share/hadoop/common/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs:/tmp/hadoop-0.23.1/share/hadoop/hdfs/lib/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/*
>  org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer
> {noformat}
> with the following message found in the logs:
> {noformat}
> 2012-02-24 09:26:31,099 FATAL 
> org.apache.hadoop.yarn.server.webproxy.WebAppProxy: Could not start proxy web 
> server
> java.io.FileNotFoundException: webapps/proxy not found in CLASSPATH
> at 
> org.apache.hadoop.http.HttpServer.getWebAppsPath(HttpServer.java:532)
> at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:224)
> at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:164)
> at 
> org.apache.hadoop.yarn.server.webproxy.WebAppProxy.start(WebAppProxy.java:85)
> at 
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
> at 
> org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.main(WebAppProxyServer.java:76)
> {noformat}



[jira] [Commented] (MAPREDUCE-5026) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations

2013-03-17 Thread sam liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604561#comment-13604561
 ] 

sam liu commented on MAPREDUCE-5026:


Hi Andrew,

Sorry for the late reply.

In the simplest test, on a one-node cluster, executing 
'HeartbeatResponse heartbeatResponse = transmitHeartBeat(now)' takes 4.3 ms on 
average with the original TaskTracker.java, measured over 1000 heartbeats. 
With the new TaskTracker.java, the average drops to 2.1 ms, so the call is 
roughly twice as fast (an improvement of a little more than 100%).

> For shortening the time of TaskTracker heartbeat, decouple the statics 
> collection operations
> 
>
> Key: MAPREDUCE-5026
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5026
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: performance, tasktracker
>Affects Versions: 1.1.1
>Reporter: sam liu
>  Labels: patch
> Fix For: 1.1.1
>
> Attachments: HDFS-4527.patch, HDFS-4527.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In each heartbeat, the TaskTracker calculates some system statistics, like 
> the free disk space, available virtual/physical memory, cpu usage, etc. 
> However, it's not necessary to calculate all the statistics in every 
> heartbeat; doing so consumes many system resources and impacts the 
> performance of the TaskTracker heartbeat. Furthermore, the characteristics 
> of the system properties (disk, memory, cpu) are different, and it's better 
> to collect their statistics at different intervals.
> To reduce the latency of the TaskTracker heartbeat, one solution is to 
> decouple all the system statistics collection operations from it, and start 
> separate threads to do the collection work when the TaskTracker starts. 
> There could be three threads: the first collects cpu-related statistics at a 
> short interval; the second collects memory-related statistics at a normal 
> interval; the third collects disk-related statistics at a long interval. All 
> the intervals could be customized by the parameter 
> "mapred.stats.collection.interval" in mapred-site.xml. The heartbeat could 
> then read the values of the system statistics directly from memory.
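
The decoupling described above might look roughly like the following sketch, which uses a ScheduledExecutorService to refresh cached values in the background so that the heartbeat only reads from memory. It is not the attached patch: StatsCacheSketch, the probe suppliers, and the per-resource interval arguments are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

class StatsCacheSketch {
    // Daemon threads, so a forgotten stop() cannot keep the JVM alive.
    private final ScheduledExecutorService pool =
        Executors.newScheduledThreadPool(3, r -> {
            Thread t = new Thread(r, "stats-collector");
            t.setDaemon(true);
            return t;
        });
    private final AtomicLong cpu = new AtomicLong();
    private final AtomicLong memory = new AtomicLong();
    private final AtomicLong disk = new AtomicLong();

    // Each probe gets its own refresh period (short for cpu, longer for
    // memory, longest for disk); the periods are supplied by the caller.
    StatsCacheSketch(LongSupplier cpuProbe, LongSupplier memoryProbe,
                     LongSupplier diskProbe,
                     long cpuMs, long memoryMs, long diskMs) {
        schedule(cpu, cpuProbe, cpuMs);
        schedule(memory, memoryProbe, memoryMs);
        schedule(disk, diskProbe, diskMs);
    }

    private void schedule(AtomicLong slot, LongSupplier probe, long periodMs) {
        slot.set(probe.getAsLong()); // sample once up front so reads are never unset
        pool.scheduleAtFixedRate(() -> slot.set(probe.getAsLong()),
                                 periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    // Called on the heartbeat path: reads cached values, performs no probing.
    long[] heartbeatSnapshot() {
        return new long[] { cpu.get(), memory.get(), disk.get() };
    }

    void stop() {
        pool.shutdownNow();
    }
}
```

The heartbeat then costs three memory reads regardless of how expensive the probes are, at the price of reporting values that may be up to one interval old.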



[jira] [Updated] (MAPREDUCE-5026) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations

2013-03-17 Thread sam liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam liu updated MAPREDUCE-5026:
---

Attachment: HDFS-4527.patch

replace Statics with Statistics



[jira] [Commented] (MAPREDUCE-5026) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations

2013-03-17 Thread sam liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604555#comment-13604555
 ] 

sam liu commented on MAPREDUCE-5026:


Hi Tian Hong,

As a general rule, their rates of change are different: cpu changes faster 
than memory, and memory changes faster than disk. So cpu should be sampled 
more often than memory, and memory more often than disk. That way we can 
reasonably reduce resource consumption on the node running the tasktracker.
