[jira] [Commented] (MAPREDUCE-7123) AM Failed with Communication error to RM

2019-10-29 Thread Amithsha (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962711#comment-16962711
 ] 

Amithsha commented on MAPREDUCE-7123:
-

[~cane] We closed this because the versions did not match (the version running 
in the container differed from the version running on the RM). The failure may 
be due to a version incompatibility.

> AM Failed with Communication error to RM
> 
>
> Key: MAPREDUCE-7123
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7123
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, yarn
> Affects Versions: 2.9.0
> Reporter: Amithsha
> Priority: Major
>
> During a restart of the NodeManagers in a 300-node cluster, some jobs failed 
> with the following exceptions.
> However, the nodes where the AM was launched are not part of the cluster.
> FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$UpdatedNodesTransition.transition(JobImpl.java:2146)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$UpdatedNodesTransition.transition(JobImpl.java:2139)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:998)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1346)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1342)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-07-14 12:34:53,425 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: ERROR IN CONTACTING RM.
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleUpdatedNodes(RMContainerAllocator.java:875)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:776)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:256)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:281)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-07-14 12:34:53,427 INFO [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7245) Reduce phase does not continue processing with failed SCHEDULED Map tasks

2019-10-29 Thread Sultan Alamro (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sultan Alamro updated MAPREDUCE-7245:
-
Affects Version/s: 3.2.1

> Reduce phase does not continue processing with failed SCHEDULED Map tasks
> -
>
> Key: MAPREDUCE-7245
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7245
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.7.2, 3.2.1
> Reporter: Sultan Alamro
> Priority: Major
>
> When we set *mapreduce.map.maxattempts* to 1, the reduce tasks should ignore 
> the output of failed tasks, as stated in the EventFetcher class. However, it 
> turns out that this only happens when a map task transitions from RUNNING to 
> FAILED, not from SCHEDULED to FAILED.






[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator

2019-10-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962665#comment-16962665
 ] 

Hadoop QA commented on MAPREDUCE-7208:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 0s{color} | {color:orange} root: The patch generated 19 new + 1195 unchanged - 8 fixed = 1214 total (was 1203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 39s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 37s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 59s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 4s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 37s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 25s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 32s{color} |

[jira] [Commented] (MAPREDUCE-7247) Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int.

2019-10-29 Thread zhaoshengjie (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962571#comment-16962571
 ] 

zhaoshengjie commented on MAPREDUCE-7247:
-

I have fixed the documentation error.

> Modify HistoryServerRest.html content,change The job attempt id‘s datatype 
> from string to int.
> --
>
> Key: MAPREDUCE-7247
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7247
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 3.2.1
> Reporter: zhaoshengjie
> Priority: Trivial
> Attachments: image-2019-10-29-14-46-17-354.png, image-2019-10-29-14-46-49-929.png
>
>
> In the documentation for the Job Attempts API 
> (http://history-server-http-address:port/ws/v1/history/mapreduce/jobs/{jobid}/jobattempts), 
> at 
> http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html#Job_Attempts_API 
> change the data type of the job attempt id from string to int.
> !image-2019-10-29-14-46-17-354.png|width=508,height=126!
> !image-2019-10-29-14-46-49-929.png|width=465,height=315!
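
A minimal sketch of why the documented type matters to a client, assuming the response uses the jobAttempts/jobAttempt layout shown in the History Server REST documentation. The sample body, its values, and the helper name attempt_ids are hypothetical and only illustrate that the id field arrives as a JSON number rather than a string.

```python
import json
from typing import List

# Hypothetical response body: the layout is assumed from the documented
# Job Attempts API, the values are made up for illustration.
SAMPLE_BODY = """
{"jobAttempts": {"jobAttempt": [
  {"id": 1, "nodeId": "host.example.com:8041", "startTime": 1572331577354}
]}}
"""


def attempt_ids(body: str) -> List[int]:
    """Return the attempt ids from a Job Attempts response body."""
    data = json.loads(body)
    attempts = data["jobAttempts"]["jobAttempt"]
    # A JSON number is decoded as a Python int, so no string parsing is needed.
    return [attempt["id"] for attempt in attempts]


if __name__ == "__main__":
    print(attempt_ids(SAMPLE_BODY))
```

A client that trusted a "string" datatype in the documentation might call string methods on the id and fail at runtime, which is the kind of mismatch the documentation fix avoids.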






[jira] [Commented] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator

2019-10-29 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962437#comment-16962437
 ] 

Ahmed Hussein commented on MAPREDUCE-7208:
--

Thanks [~jeagles]. I looked at the test cases:
* {{hadoop.mapreduce.v2.TestSpeculativeExecutionWithMRApp}} is a related test 
case. It was failing because I changed the threshold of the estimate that 
triggers a new speculative task. I fixed that default behavior in the new patch.
* {{hadoop.mapred.TestLocalMRNotification}} and 
{{hadoop.mapreduce.v2.TestMROldApiJobs}} seem to be random failures. They pass 
on my local machine.

> Tuning TaskRuntimeEstimator 
> 
>
> Key: MAPREDUCE-7208
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7208
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: MAPREDUCE-7208.001.patch, MAPREDUCE-7208.002.patch, smoothing-exponential.md
>
>
> By default, MR uses LegacyTaskRuntimeEstimator to get an estimate of the 
> runtime. The estimator does not adjust dynamically to the progress rate of 
> the tasks. On the other hand, the behavior of the existing alternative, 
> "ExponentiallySmoothedTaskRuntimeEstimator", is unpredictable.
>  
> There are several dimensions along which to improve the exponential 
> implementation:
>  # Exponential smoothing needs a warmup period. Otherwise, the estimate will 
> be affected by the initial values.
>  # Using a single smoothing factor (Lambda) does not work well for all 
> tasks. To increase the level of smoothing across the majority of tasks, we 
> need some flexibility to dynamically adjust the smoothing factor based on 
> the history of the task progress.
>  # Design-wise, it is better to separate the statistical model from the MR 
> interface. We need a way to evaluate estimators statistically, without the 
> need to run MR. For example, an estimator can be evaluated as a black box by 
> using a stream of raw data as input and testing the accuracy of the 
> generated stream of estimates.
>  # The exponential estimator speculates frequently and fails to detect 
> slowing tasks. As a result, a taskAttempt that makes no progress won't 
> trigger a new speculation.
>  
> The file [^smoothing-exponential.md] describes how the simple exponential 
> smoothing factor works.
>  
>  
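
The warmup point in the description above can be sketched as follows. This is a minimal illustration of simple exponential smoothing with a warmup period, not the code in the attached patches. The class name SmoothedRate, the parameter names smoothing_factor and warmup, and the convention that the factor weights the newest sample are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SmoothedRate:
    """Hypothetical helper: exponentially smoothed progress rate with warmup."""

    smoothing_factor: float  # assumed to weight the newest sample, in (0, 1]
    warmup: int              # number of initial samples averaged plainly
    count: int = 0
    value: float = 0.0

    def update(self, sample: float) -> float:
        self.count += 1
        if self.count <= self.warmup:
            # Warmup: keep a running mean so that a single early sample
            # does not dominate the estimate.
            self.value += (sample - self.value) / self.count
        else:
            # Simple exponential smoothing:
            # s_t = factor * x_t + (1 - factor) * s_{t-1}
            f = self.smoothing_factor
            self.value = f * sample + (1.0 - f) * self.value
        return self.value


def estimate_remaining(progress: float, rate: float) -> float:
    """Remaining time = remaining progress / rate; infinite if stalled."""
    if rate <= 0.0:
        return float("inf")
    return (1.0 - progress) / rate


if __name__ == "__main__":
    rate = SmoothedRate(smoothing_factor=0.5, warmup=2)
    for sample in (0.2, 0.4, 0.1):
        rate.update(sample)
    print(rate.value, estimate_remaining(0.5, rate.value))
```

Because the smoothing state is a plain object fed by a stream of samples, it can be exercised with recorded progress data and no running MR job, which is the kind of black-box evaluation the third item asks for. Returning infinity for a zero rate is one possible way to make a stalled attempt stand out; how the patch handles that case is not shown here.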






[jira] [Updated] (MAPREDUCE-7208) Tuning TaskRuntimeEstimator

2019-10-29 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated MAPREDUCE-7208:
-
Attachment: MAPREDUCE-7208.002.patch







[jira] [Created] (MAPREDUCE-7247) Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int.

2019-10-29 Thread zhaoshengjie (Jira)
zhaoshengjie created MAPREDUCE-7247:
---

Summary: Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int.
Key: MAPREDUCE-7247
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7247
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: documentation
Affects Versions: 3.2.1
Reporter: zhaoshengjie
Attachments: image-2019-10-29-14-46-17-354.png, image-2019-10-29-14-46-49-929.png

In the documentation for the Job Attempts API 
(http://history-server-http-address:port/ws/v1/history/mapreduce/jobs/{jobid}/jobattempts), 
at 
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html#Job_Attempts_API 
change the data type of the job attempt id from string to int.

!image-2019-10-29-14-46-17-354.png|width=508,height=126!

!image-2019-10-29-14-46-49-929.png|width=465,height=315!


