[
https://issues.apache.org/jira/browse/TEZ-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956029#comment-13956029
]
Rajesh Balamohan commented on TEZ-988:
--------------------------------------
For a 20-node cluster, tez.runtime.shuffle.keep-alive.max.connections=20 and,
for a 100-node cluster, tez.runtime.shuffle.keep-alive.max.connections=50 would
be a good start (assuming a mix of small/medium/large jobs is running in
the cluster and a large job's map tasks run on 50% of the nodes).
> http.maxConnections needs to be configurable in Tez Fetcher & read from
> errorstream to make the connection reusable
> -------------------------------------------------------------------------------------------------------------------
>
> Key: TEZ-988
> URL: https://issues.apache.org/jira/browse/TEZ-988
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.4.0
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-988-v1.patch, TEZ-988-v2.patch, TEZ-988-v3.patch,
> TEZ-988-v4.patch, TEZ-988-v5.patch
>
>
> 1. Currently http.maxConnections is set to 5 (the default). Make this
> configurable in Fetcher.java. This will help in running larger queries.
> 2. The error stream has to be read completely in order to make the connection
> reusable (when keepAlive is enabled). Currently, we do not read the error stream.
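
The two points in the description can be sketched roughly as below. This is only an illustration, not the code from the attached patches: the class name KeepAliveSketch and its methods are hypothetical, and only the JDK system properties http.keepAlive and http.maxConnections and HttpURLConnection.getErrorStream() are real APIs. The JDK most likely reads http.maxConnections once, when its keep-alive cache is initialized, so the property should be set before the first HTTP connection is opened.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;

// Hypothetical helper, not the actual Tez Fetcher code.
public class KeepAliveSketch {

    // Point 1: raise the JDK-wide limit on idle keep-alive connections per
    // destination (JDK default is 5). Assumed to be called before any HTTP
    // connection is opened in this JVM.
    static void configureKeepAlive(int maxConnections) {
        System.setProperty("http.keepAlive", "true");
        System.setProperty("http.maxConnections", Integer.toString(maxConnections));
    }

    // Reads a stream to the end and closes it.
    // Returns the number of bytes discarded, 0 for a null stream,
    // or -1 if reading failed (the connection is then unlikely to be reused).
    static long drainQuietly(InputStream in) {
        if (in == null) {
            return 0;
        }
        long total = 0;
        byte[] buf = new byte[4096];
        try (InputStream s = in) {
            int n;
            while ((n = s.read(buf)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            return -1;
        }
    }

    // Point 2: after a failed request, consume the error body completely so
    // the underlying socket can go back to the keep-alive cache.
    // getErrorStream() returns null when the server sent no error body.
    static long drainErrorStream(HttpURLConnection conn) {
        return drainQuietly(conn.getErrorStream());
    }
}
```

A fetcher would call configureKeepAlive(...) once at startup with the configured value (for example 50 on a 100-node cluster) and call drainErrorStream(conn) in its failure path instead of disconnecting immediately.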