[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-09-11 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated MAPREDUCE-6870:
---
Release Note: Enables {{mapreduce.job.finish-when-all-reducers-done}} by 
default. With this enabled, a MapReduce job will complete as soon as all of its 
reducers are complete, even if some mappers are still running. This can occur 
if a mapper was relaunched after node failure but the relaunched task's output 
is not actually needed. Previously the job would wait for all mappers to 
complete.

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-09-11 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated MAPREDUCE-6870:
---
Release Note: Enables mapreduce.job.finish-when-all-reducers-done by 
default. With this enabled, a MapReduce job will complete as soon as all of its 
reducers are complete, even if some mappers are still running. This can occur 
if a mapper was relaunched after node failure but the relaunched task's output 
is not actually needed. Previously the job would wait for all mappers to 
complete.  (was: Enables {{mapreduce.job.finish-when-all-reducers-done}} by 
default. With this enabled, a MapReduce job will complete as soon as all of its 
reducers are complete, even if some mappers are still running. This can occur 
if a mapper was relaunched after node failure but the relaunched task's output 
is not actually needed. Previously the job would wait for all mappers to 
complete.)

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-11 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6870:
--
Hadoop Flags: Incompatible change,Reviewed  (was: Reviewed)

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6870:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

Thanks [~pbacsko] for the patch! I have committed it to trunk.

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-10 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-007.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-09 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-006.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-09 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-005.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-08 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-004.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-08-02 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-003.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-07-05 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-002.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-06-29 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Attachment: MAPREDUCE-6870-001.patch

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-06-29 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6870:

Status: Patch Available  (was: Open)

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6870-001.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org