[jira] [Commented] (HADOOP-16618) increase the default number of threads and http connections in S3A
    [ https://issues.apache.org/jira/browse/HADOOP-16618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476818#comment-17476818 ]

Steve Loughran commented on HADOOP-16618:
-----------------------------------------

* There's a fixed thread pool and an unlimited pool, both for doing work on behalf of calling threads.
* All HTTP IO which takes place in blocking calls happens in the threads calling in. If you have a Hive or Spark worker process with 128 threads, it'll be using at least that many connections when reading data; when writing, blocks can be queued for async upload.

Job commit in the S3A committer is another many-thread operation.

> increase the default number of threads and http connections in S3A
> ------------------------------------------------------------------
>
>                 Key: HADOOP-16618
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16618
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.1
>            Reporter: Steve Loughran
>            Priority: Major
>
> Enable bigger thread and http pools in the S3A connector, especially now that
> the transfer manager is doing parallel block transfer, as is rename()
> We can make a lot more with parallelism in a single thread, and for
> applications with multiple worker threads, we need bigger defaults

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
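
The point above, that connection demand comes from the calling threads as well as from the fixed pool, can be illustrated with a small stand-alone simulation. This is only a sketch and not S3A code: the class name, the method `peakConnections` and the use of a `Semaphore` as a stand-in for the HTTP connection pool are all assumptions made for illustration. Each simulated request holds its "connection" until every request that can hold one is doing so, which makes the measured peak deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model: a Semaphore stands in for the HTTP connection pool. */
public class ConnectionDemandSketch {

    /** Returns the peak number of connections held at the same time. */
    static int peakConnections(int callerThreads, int poolThreads, int maxConnections) {
        Semaphore connections = new Semaphore(maxConnections);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // The most requests that can ever hold a connection simultaneously.
        int holders = Math.min(callerThreads + poolThreads, maxConnections);
        CountDownLatch allHolding = new CountDownLatch(holders);

        Runnable request = () -> {
            try {
                connections.acquire();              // blocks if the pool is exhausted
                try {
                    peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                    allHolding.countDown();
                    allHolding.await();             // hold until all holders have arrived
                } finally {
                    inUse.decrementAndGet();
                    connections.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // Bounded pool: models async work such as queued block uploads.
        ExecutorService pool = Executors.newFixedThreadPool(poolThreads);
        for (int i = 0; i < poolThreads; i++) {
            pool.execute(request);
        }
        // Caller threads: model worker threads doing blocking reads themselves.
        List<Thread> callers = new ArrayList<>();
        for (int i = 0; i < callerThreads; i++) {
            Thread t = new Thread(request);
            callers.add(t);
            t.start();
        }
        pool.shutdown();
        try {
            for (Thread t : callers) {
                t.join();
            }
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 4 callers + 10 pool threads, 96 connections: all 14 requests hold one.
        System.out.println(peakConnections(4, 10, 96));    // 14
        // 128 callers + 10 pool threads: demand is capped by the 96 connections.
        System.out.println(peakConnections(128, 10, 96));  // 96
    }
}
```

In this model a pool of 10 threads does not limit connection use to 10: with 128 caller threads the 96-connection limit is reached and the remaining requests wait for a connection to be released.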
[jira] [Commented] (HADOOP-16618) increase the default number of threads and http connections in S3A
    [ https://issues.apache.org/jira/browse/HADOOP-16618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476069#comment-17476069 ]

Ahmar Suhail commented on HADOOP-16618:
---------------------------------------

I started looking at this ticket and the default values are currently:

DEFAULT_MAXIMUM_CONNECTIONS = 96
DEFAULT_MAX_THREADS = 10
DEFAULT_SOCKET_SEND_BUFFER = 8 * 1024

I was wondering if you had any ideas for what the new numbers should be, or what we can do to find these numbers? I tried tweaking the values and ran a couple of performance tests, see [concurrent renames|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/ITestS3AConcurrentOps.java#L164] and [read file|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/ITestS3AInputStreamPerformance.java#L244], but I don't know if that gives us enough data.

I also wanted to understand what advantage a connection pool of size 96 gives. From what I understand, if we have max_threads set to 10, we'll only ever use a maximum of 10 connections?
[jira] [Commented] (HADOOP-16618) increase the default number of threads and http connections in S3A
    [ https://issues.apache.org/jira/browse/HADOOP-16618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259830#comment-17259830 ]

Steve Loughran commented on HADOOP-16618:
-----------------------------------------

also: socket buffer sizes

--
This message was sent by Atlassian Jira
(v8.3.4#803005)