
[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs

2017-03-07 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated MAPREDUCE-6858:

Description: 
- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, I 
wouldn't be surprised if jobs end up piling up, not getting processed for 
quite some time, and producing lots of FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?

  was:
- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, I 
wouldn't be surprised if jobs end up piling up, not getting processed for 
quite some time, and producing lots of FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?


> HistoryFileManager thrashing due to high volume jobs
> ----------------------------------------------------
>
> Key: MAPREDUCE-6858
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6858
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Yufei Gu
>
> - JHS scans "done_intermediate" dir for files to process and adds them to a 
> thread pool
> - Thread pool starts processing these files to move them to "done" dir
> - JHS scans "done_intermediate" again for files to process and adds them to a 
> thread pool
> -- If we have enough jobs that the thread pool can't keep up with the 
> scanning interval, they'll get added twice (or more). If this keeps 
> compounding, I wouldn't be surprised if jobs end up piling up, not getting 
> processed for quite some time, and producing lots of FileNotFoundExceptions.
> By default, it looks like the thread pool only has 3 threads in it 
> (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
> (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
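
The compounding described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class and method names are hypothetical, the numbers are arbitrary, and the scan/move cycle is modelled with a plain queue rather than the actual HistoryFileManager code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ScanBacklogSketch {

    /**
     * Simulates repeated scans of done_intermediate.
     * Returns the number of move tasks still queued after the given scans.
     */
    static int queuedAfter(int pendingFiles, int movesPerInterval, int scans) {
        Deque<Integer> queue = new ArrayDeque<>();
        boolean[] moved = new boolean[pendingFiles];
        for (int s = 0; s < scans; s++) {
            // Scan: every file still in done_intermediate is queued again,
            // even if an earlier task for the same file is still waiting.
            for (int f = 0; f < pendingFiles; f++) {
                if (!moved[f]) {
                    queue.add(f);
                }
            }
            // Pool: only a bounded number of tasks finish before the next scan.
            for (int i = 0; i < movesPerInterval && !queue.isEmpty(); i++) {
                // A task for an already moved file is the duplicate that
                // would hit FileNotFoundException in the real server.
                moved[queue.poll()] = true;
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // 10 files, 4 moves per scan interval, 3 scans.
        System.out.println(queuedAfter(10, 4, 3)); // prints 6
    }
}
```

In this run all 10 files have been moved after the third scan, yet 6 tasks remain queued, and every one of them is a duplicate for a file that is no longer in done_intermediate.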



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs

2017-03-07 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated MAPREDUCE-6858:

Description: 
- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, jobs 
would pile up, not get processed for quite some time, and produce lots of 
FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?

  was:
- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, I 
wouldn't be surprised if jobs end up piling up, not getting processed for 
quite some time, and producing lots of FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?


> HistoryFileManager thrashing due to high volume jobs
> ----------------------------------------------------
>
> Key: MAPREDUCE-6858
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6858
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Yufei Gu
>
> - JHS scans "done_intermediate" dir for files to process and adds them to a 
> thread pool
> - Thread pool starts processing these files to move them to "done" dir
> - JHS scans "done_intermediate" again for files to process and adds them to a 
> thread pool
> -- If we have enough jobs that the thread pool can't keep up with the 
> scanning interval, they'll get added twice (or more). If this keeps 
> compounding, jobs would pile up, not get processed for quite some time, 
> and produce lots of FileNotFoundExceptions.
> By default, it looks like the thread pool only has 3 threads in it 
> (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
> (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?




[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs

2017-03-07 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated MAPREDUCE-6858:

Description: 
The log of JHS shows that it tried to move the same *.jhist file twice, and 
the second move causes FileNotFoundExceptions.

- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, jobs 
would pile up, not get processed for quite some time, and produce lots of 
FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?

  was:
- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, jobs 
would pile up, not get processed for quite some time, and produce lots of 
FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?


> HistoryFileManager thrashing due to high volume jobs
> ----------------------------------------------------
>
> Key: MAPREDUCE-6858
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6858
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Yufei Gu
>
> The log of JHS shows that it tried to move the same *.jhist file twice, and 
> the second move causes FileNotFoundExceptions.
> - JHS scans "done_intermediate" dir for files to process and adds them to a 
> thread pool
> - Thread pool starts processing these files to move them to "done" dir
> - JHS scans "done_intermediate" again for files to process and adds them to a 
> thread pool
> -- If we have enough jobs that the thread pool can't keep up with the 
> scanning interval, they'll get added twice (or more). If this keeps 
> compounding, jobs would pile up, not get processed for quite some time, 
> and produce lots of FileNotFoundExceptions.
> By default, it looks like the thread pool only has 3 threads in it 
> (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
> (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?




[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs

2017-03-07 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated MAPREDUCE-6858:

Description: 
JHS log shows that it tried to move the same *.jhist file twice, and the 
second move causes FileNotFoundExceptions.

- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, jobs 
would pile up, not get processed for quite some time, and produce lots of 
FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?

  was:
The log of JHS shows that it tried to move the same *.jhist file twice, and 
the second move causes FileNotFoundExceptions.

- JHS scans "done_intermediate" dir for files to process and adds them to a 
thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a 
thread pool
-- If we have enough jobs that the thread pool can't keep up with the scanning 
interval, they'll get added twice (or more). If this keeps compounding, jobs 
would pile up, not get processed for quite some time, and produce lots of 
FileNotFoundExceptions.

By default, it looks like the thread pool only has 3 threads in it 
(mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
(mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?


> HistoryFileManager thrashing due to high volume jobs
> ----------------------------------------------------
>
> Key: MAPREDUCE-6858
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6858
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Yufei Gu
>
> JHS log shows that it tried to move the same *.jhist file twice, and the 
> second move causes FileNotFoundExceptions.
> - JHS scans "done_intermediate" dir for files to process and adds them to a 
> thread pool
> - Thread pool starts processing these files to move them to "done" dir
> - JHS scans "done_intermediate" again for files to process and adds them to a 
> thread pool
> -- If we have enough jobs that the thread pool can't keep up with the 
> scanning interval, they'll get added twice (or more). If this keeps 
> compounding, jobs would pile up, not get processed for quite some time, 
> and produce lots of FileNotFoundExceptions.
> By default, it looks like the thread pool only has 3 threads in it 
> (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes 
> (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
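
Besides raising the two settings, one possible direction is to avoid queuing a file that already has a pending move. The following is a minimal sketch of that idea, not the JHS implementation or any committed patch: DedupMoveSketch, submitMove, and the in-flight set are hypothetical names, and the actual move is passed in as a plain Runnable.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DedupMoveSketch {
    // Files that have a move queued or running (hypothetical bookkeeping).
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();
    private final ExecutorService pool;
    final AtomicInteger movesStarted = new AtomicInteger();

    DedupMoveSketch(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    /** Queues a move unless one is already pending; returns true if queued. */
    boolean submitMove(String file, Runnable move) {
        if (!inFlight.add(file)) {
            return false; // a later scan saw the same file again: skip it
        }
        pool.execute(() -> {
            try {
                movesStarted.incrementAndGet();
                move.run();
            } finally {
                inFlight.remove(file); // forget the file once the move ends
            }
        });
        return true;
    }

    /** Stops accepting work and waits briefly for queued moves to finish. */
    void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        DedupMoveSketch sketch = new DedupMoveSketch(3);
        Runnable slowMove = () -> {
            try {
                Thread.sleep(200); // stand-in for a slow move to "done"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(sketch.submitMove("job_1.jhist", slowMove));
        System.out.println(sketch.submitMove("job_1.jhist", slowMove));
        sketch.shutdown();
    }
}
```

With this guard a second scan that sees the same file while its first move is still queued or running does not add another task, so no later task finds the file already gone.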



