[jira] [Created] (MAPREDUCE-7187) RMContainerAllocator.ScheduledRequests#getContainerReqToReplace may not find a task when the priority of the container is PRIORITY_MAP

2019-02-28 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created MAPREDUCE-7187:
--

 Summary: 
RMContainerAllocator.ScheduledRequests#getContainerReqToReplace may not find a 
task when the priority of the container is PRIORITY_MAP
 Key: MAPREDUCE-7187
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7187
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.7.5
Reporter: Zhizhen Hou


The resource manager may have allocated a map container on a host (h1, for 
example) for an application before the MRAppMaster has picked the container 
up. At that point the MRAppMaster receives a task-failure event for a task on 
host h1, and the event causes h1 to be blacklisted. When the MRAppMaster next 
sends a heartbeat, it receives the container on h1, but it cannot assign the 
container because the host is blacklisted. getContainerReqToReplace then 
fails to return another task to take over the container, which may leave a 
map task hanging forever.
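
To make the failure mode concrete, here is a minimal, self-contained sketch 
of the host-keyed lookup described above. The class and field names are 
illustrative and this is not the actual Hadoop source; it only shows how a 
PRIORITY_MAP-style branch can return null when the only host-local candidate 
is no longer pending:

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.Map;

public class ReplacementSketch {
  // Pending map container requests, keyed by task attempt id.
  private final Map<String, String> maps = new LinkedHashMap<>();
  // Host name -> attempts that asked for a container on that host.
  private final Map<String, LinkedList<String>> mapsHostMapping =
      new HashMap<>();

  String getContainerReqToReplace(String allocatedHost) {
    LinkedList<String> list = mapsHostMapping.get(allocatedHost);
    if (list != null && !list.isEmpty()) {
      // Only the host-local candidate is consulted. If that attempt is no
      // longer pending, remove() yields null and the method gives up
      // without falling back to any other pending request, so the unusable
      // container is handed back with nothing re-requested in its place.
      return maps.remove(list.removeLast());
    }
    // The fallback to an arbitrary pending request is only reached when
    // there is no host-local candidate at all.
    return maps.isEmpty() ? null
        : maps.remove(maps.keySet().iterator().next());
  }
}

Under this pattern, a stale entry in the host list short-circuits the 
fallback, so the container is released with no replacement request made.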






[jira] [Resolved] (MAPREDUCE-7081) Default speculator won't speculate the last several submitted reduce tasks if the total task number is large

2019-02-25 Thread Zhizhen Hou (JIRA)


 [ https://issues.apache.org/jira/browse/MAPREDUCE-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhizhen Hou resolved MAPREDUCE-7081.

Resolution: Invalid

> Default speculator won't speculate the last several submitted reduce tasks 
> if the total task number is large
> -
>
> Key: MAPREDUCE-7081
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7081
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 2.9.0, 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> DefaultSpeculator speculates each task at most once. By default, the number 
> of tasks that may be speculated at any one time is capped at max(10, 0.01 * 
> total tasks, 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduce tasks 
> start only after all the map tasks have finished. The cluster has 1000 
> vcores, and the job has 5000 reduce tasks. At first, 1000 reduce tasks can 
> run simultaneously, so at most 0.1 * 1000 = 100 tasks can be speculated. 
> Reduce tasks with little data finish quickly, and by default the speculator 
> launches at most one speculative attempt per second, so a task may be 
> chosen simply because it has more data to process than the average. In this 
> way all 100 speculation slots can be used up within about 100 seconds. 
> Later, when 4900 reduces have finished, if a remaining reduce with a lot of 
> data lands on a slow machine, the speculation quota is already exhausted 
> and the task will not be speculated, which can increase the job's execution 
> time significantly.
> In short, the speculation quota may be wasted early on merely because 
> reduces with little data finish faster than the average. At the end of the 
> job no quota remains for the last few running tasks, because the cap is 
> derived from the number of running tasks.
> In my opinion, the number of running tasks should not determine the 
> speculation quota. Instead, the number of tasks that may be speculated 
> could be bounded by the square of the fraction of finished tasks. For 
> example, when ninety percent of the tasks are finished, only 0.9 * 0.9 = 
> 0.81 of the quota can be used, leaving enough headroom for the last tasks.






[jira] [Created] (MAPREDUCE-7081) Default speculator won't speculate the last several submitted reduce tasks if the total task number is large

2018-04-16 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created MAPREDUCE-7081:
--

 Summary: Default speculator won't speculate the last several 
submitted reduce tasks if the total task number is large
 Key: MAPREDUCE-7081
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7081
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.7.5, 2.9.0
Reporter: Zhizhen Hou


DefaultSpeculator speculates each task at most once. By default, the number 
of tasks that may be speculated at any one time is capped at max(10, 0.01 * 
total tasks, 0.1 * running tasks).
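
As a quick check of the arithmetic, here is a minimal sketch of that cap 
applied to the job described below (class and method names are illustrative, 
not the Hadoop source):

public class SpeculationCapSketch {
  // Cap on concurrently speculated tasks, per the defaults quoted above:
  // at least 10, or 1% of all tasks, or 10% of running tasks.
  static int speculationCap(int totalTasks, int runningTasks) {
    int byTotal = (int) (0.01 * totalTasks);
    int byRunning = (int) (0.1 * runningTasks);
    return Math.max(10, Math.max(byTotal, byRunning));
  }

  public static void main(String[] args) {
    // 5000 reduces in the job, 1000 running at once:
    // max(10, 50, 100) = 100 speculative attempts allowed.
    System.out.println(speculationCap(5000, 1000));
  }
}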

I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduce tasks 
start only after all the map tasks have finished. The cluster has 1000 
vcores, and the job has 5000 reduce tasks. At first, 1000 reduce tasks can 
run simultaneously, so at most 0.1 * 1000 = 100 tasks can be speculated. 
Reduce tasks with little data finish quickly, and by default the speculator 
launches at most one speculative attempt per second, so a task may be chosen 
simply because it has more data to process than the average. In this way all 
100 speculation slots can be used up within about 100 seconds. Later, when 
4900 reduces have finished, if a remaining reduce with a lot of data lands on 
a slow machine, the speculation quota is already exhausted and the task will 
not be speculated, which can increase the job's execution time significantly.

In short, the speculation quota may be wasted early on merely because reduces 
with little data finish faster than the average. At the end of the job no 
quota remains for the last few running tasks, because the cap is derived from 
the number of running tasks.

In my opinion, the number of running tasks should not determine the 
speculation quota. Instead, the number of tasks that may be speculated could 
be bounded by the square of the fraction of finished tasks. For example, when 
ninety percent of the tasks are finished, only 0.9 * 0.9 = 0.81 of the quota 
can be used, leaving enough headroom for the last tasks.
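
A minimal sketch of that proposal; this is purely illustrative and not 
existing Hadoop code (the class, method, and cap value are assumptions):

public class ProposedQuotaSketch {
  // Bound the quota consumed so far by the square of the finished fraction,
  // so that slots are released gradually and some remain for the tail.
  static int usableQuota(int maxQuota, int finishedTasks, int totalTasks) {
    double finished = (double) finishedTasks / totalTasks;
    return (int) (maxQuota * finished * finished);
  }

  public static void main(String[] args) {
    // With a cap of 100 and 90% of 5000 tasks finished:
    // 100 * 0.9 * 0.9 = 81 slots usable, 19 held back for late stragglers.
    System.out.println(usableQuota(100, 4500, 5000));
  }
}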






[jira] [Created] (MAPREDUCE-7080) Default speculator won't speculate the last several submitted reduce tasks if the total task number is large

2018-04-16 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created MAPREDUCE-7080:
--

 Summary: Default speculator won't speculate the last several 
submitted reduce tasks if the total task number is large
 Key: MAPREDUCE-7080
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7080
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.7.5
Reporter: Zhizhen Hou


DefaultSpeculator speculates each task at most once.

By default, the number of tasks that may be speculated at any one time is 
capped at max(10, 0.01 * total tasks, 0.1 * running tasks).

I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduce tasks 
start only after all the map tasks have finished.

The cluster has 1000 vcores, and the job has 5000 reduce tasks.

At first, 1000 reduce tasks can run simultaneously, so at most 0.1 * 1000 = 
100 tasks can be speculated. Reduce tasks with little data finish quickly, 
and by default the speculator launches at most one speculative attempt per 
second, so a task may be chosen simply because it has more data to process 
than the average. In this way all 100 speculation slots can be used up within 
about 100 seconds.

Later, when 4900 reduces have finished, if a remaining reduce with a lot of 
data lands on a slow machine, the speculation quota is already exhausted and 
the task will not be speculated, which can increase the job's execution time 
significantly.

In short, the speculation quota may be wasted early on merely because reduces 
with little data finish faster than the average. At the end of the job no 
quota remains for the last few running tasks, because the cap is derived from 
the number of running tasks.

In my opinion, the number of tasks that may be speculated could be bounded by 
the square of the fraction of finished tasks. For example, when ninety 
percent of the tasks are finished, only 0.9 * 0.9 = 0.81 of the quota can be 
used, leaving enough headroom for the last tasks.






[jira] [Created] (MAPREDUCE-7075) Change default configuration for mapreduce.reduce.shuffle.input.buffer.percent

2018-04-10 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created MAPREDUCE-7075:
--

 Summary: Change default configuration for 
mapreduce.reduce.shuffle.input.buffer.percent
 Key: MAPREDUCE-7075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7075
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.9.1
Reporter: Zhizhen Hou


When mapreduce.reduce.shuffle.input.buffer.percent is left at its default 
value of 0.7, the shuffle stage may fail with an OOM exception.
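
As a per-job workaround until the default changes, the value can be lowered 
through the standard Configuration API; a minimal sketch (the 0.5 value is 
illustrative, not the new default this issue proposes):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ShuffleBufferTuning {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Fraction of the reducer's heap used to buffer map output during
    // shuffle; lowering it trades some speed for OOM safety.
    conf.setFloat("mapreduce.reduce.shuffle.input.buffer.percent", 0.5f);
    Job job = Job.getInstance(conf, "shuffle-buffer-tuning");
    // ... configure mapper/reducer and submit as usual ...
  }
}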


