[
https://issues.apache.org/jira/browse/HADOOP-18650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-18650.
-------------------------------------
Resolution: Won't Fix
I don't think this is worth the effort.
> improve s3a committer stats collected
> -------------------------------------
>
> Key: HADOOP-18650
> URL: https://issues.apache.org/jira/browse/HADOOP-18650
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.5
> Reporter: Steve Loughran
> Priority: Major
>
> We can improve the stats collected in the s3a committer and saved to the JSON.
> Key ones:
> # number of task manifests read; duration of loads
> # size of each manifest
> I think we would also benefit if we could set the commit thread pools to be
> big, but then shared across all jobs (i.e. a demand-created thread pool in the
> s3a fs). That would allow for a pool size of, say, 500, but still support many
> jobs actively committing at the same time (busy Spark driver).
> Finally: should the file commit pool size be > the size of the pool of manifest
> readers? I think it could be, but the ratio should be fairly low.
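
For illustration only, the shared-pool idea in the description could be sketched roughly as below. This is not code from Hadoop or the s3a committer: the names SharedCommitPool, JobView and forJob are hypothetical, and the per-job cap is an assumption added here so that one job cannot monopolise the shared pool. It uses only java.util.concurrent.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one large pool shared by all jobs, with a per-job cap. */
final class SharedCommitPool {
    private final ThreadPoolExecutor shared;

    SharedCommitPool(int maxThreads) {
        // core == max with an unbounded queue: threads are created on demand
        // up to maxThreads, and any further tasks wait in the queue.
        shared = new ThreadPoolExecutor(maxThreads, maxThreads,
                60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // Let idle threads exit, so an idle process holds no commit threads.
        shared.allowCoreThreadTimeOut(true);
    }

    /** Returns a per-job view limited to maxInFlight queued-or-running tasks. */
    JobView forJob(int maxInFlight) {
        return new JobView(maxInFlight);
    }

    int largestPoolSize() {
        return shared.getLargestPoolSize();
    }

    void shutdown() {
        shared.shutdown();
    }

    /** Per-job handle; every job's tasks run on the same shared threads. */
    final class JobView {
        private final Semaphore permits;

        private JobView(int maxInFlight) {
            this.permits = new Semaphore(maxInFlight);
        }

        /** Blocks the submitting thread while this job is at its cap. */
        <T> Future<T> submit(Callable<T> task) throws InterruptedException {
            permits.acquire();
            try {
                return shared.submit(() -> {
                    try {
                        return task.call();
                    } finally {
                        // Free the slot once the task has finished or failed.
                        permits.release();
                    }
                });
            } catch (RuntimeException e) {
                // Submission was rejected (e.g. after shutdown): return the permit.
                permits.release();
                throw e;
            }
        }
    }
}
```

Under these assumptions, each job would call forJob() once and submit its manifest loads and file commits through the returned view, so the pool could be sized at something like 500 threads while many jobs commit at the same time. The question of the ratio between file-commit and manifest-reader parallelism would then become a choice of two different per-job caps over the same pool.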
--
This message was sent by Atlassian Jira
(v8.20.10#820010)