[ 
https://issues.apache.org/jira/browse/TEZ-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877526#comment-15877526
 ] 

Rajesh Balamohan edited comment on TEZ-3633 at 2/22/17 5:46 AM:
----------------------------------------------------------------

I enabled keep-alive in tez and mapreduce shuffle handler. (i.e 
tez.runtime.shuffle.keep-alive.enabled=true and 
mapreduce.shuffle.connection-keep-alive.enable=true)

To verify keep-alive connections are working without connection breaks, I 
enabled wire logging in HttpUrlConnection as follows

Created the following file {{http.logger.properties}} and placed in {{/tmp/}} 
folder in test cluster

{noformat}
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level = ALL
java.util.logging.ConsoleHandler.formatter = 
sun.net.www.protocol.http.HttpLogFormatter
sun.net.www.protocol.http.HttpURLConnection.level = ALL
sun.net.www.protocol.http.level=ALL
{noformat}

Tried out query-17 from TPC-DS with hive/tez which is relatively shuffle 
intensive in hive-1.  

{noformat}
hive --hiveconf 
tez.task.launch.cmd-opts="-Djava.util.logging.config.file=/tmp/http.logger.properties"
{noformat}

>From the logs, I see {{KeepAlive stream used}} which indicates that keepAlive 
>connections being used from sun's internal implementation. Hadoop version used 
>is 2.7.3 from HDP.

{noformat}
FINE: sun.net.www.MessageHeader@4f81a00c8 pairs: {GET 
/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
 HTTP/1.1: null}{UrlHash: rjk3o9nG2khcu5d+QGw1Gk1nhBY=}{name: 
mapreduce}{version: 1.0.0}{User-Agent: Java/1.8.0_121}{Host: 
ip-10-0-1-66.ec2.internal:13562}{Accept: text/html, image/gif, image/jpeg, *; 
q=.2, */*; q=.2}{Connection: keep-alive}
Feb 22, 2017 5:16:37 AM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
Feb 22, 2017 5:16:37 AM sun.net.www.protocol.http.HttpURLConnection 
getInputStream0
FINE: sun.net.www.MessageHeader@4146578c7 pairs: {null: HTTP/1.1 200 
OK}{ReplyHash: wox3/ak3+1T3PwXc8eCk9C4wtDs=}{name: mapreduce}{version: 
1.0.0}{Content-Length: 1156}{Connection: Keep-Alive}{Keep-Alive: timeout=5}
Feb 22, 2017 5:16:38 AM sun.net.www.protocol.http.HttpURLConnection 
plainConnect0
...
FINEST: ProxySelector Request for 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000002_0_10006,attempt_1487556355205_0099_1_06_000003_0_10007,attempt_1487556355205_0099_1_06_000004_0_10006,attempt_1487556355205_0099_1_06_000001_0_10005,attempt_1487556355205_0099_1_06_000000_0_10005,attempt_1487556355205_0099_1_06_000007_0_10003,attempt_1487556355205_0099_1_06_000005_0_10002,attempt_1487556355205_0099_1_06_000006_0_10002,attempt_1487556355205_0099_1_06_000027_0_10023,attempt_1487556355205_0099_1_06_000033_0_10023,attempt_1487556355205_0099_1_06_000039_0_10026,attempt_1487556355205_0099_1_06_000044_0_10026,attempt_1487556355205_0099_1_06_000049_0_10026,attempt_1487556355205_0099_1_06_000038_0_10017,attempt_1487556355205_0099_1_06_000046_0_10017,attempt_1487556355205_0099_1_06_000048_0_10026&keepAlive=true
...
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true)
Feb 22, 2017 5:16:38 AM sun.net.www.protocol.http.HttpURLConnection 
plainConnect0

........ printing more KeepAlive log lines alone from here onwards for 
reference..
...... 
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000003_0_10007,attempt_1487556355205_0099_1_06_000004_0_10006,attempt_1487556355205_0099_1_06_000001_0_10005,attempt_1487556355205_0099_1_06_000000_0_10005&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000007_0_10003,attempt_1487556355205_0099_1_06_000005_0_10002&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000001_0_10010&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000000_0_10010&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000002_0_10006&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000006_0_10002,attempt_1487556355205_0099_1_06_000027_0_10023,attempt_1487556355205_0099_1_06_000033_0_10023,attempt_1487556355205_0099_1_06_000039_0_10026&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_03_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000001_0_10010&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000044_0_10026&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-228.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000021_0_10007,attempt_1487556355205_0099_1_04_000034_0_10007,attempt_1487556355205_0099_1_04_000037_0_10006,attempt_1487556355205_0099_1_04_000027_0_10006,attempt_1487556355205_0099_1_04_000033_0_10006,attempt_1487556355205_0099_1_04_000036_0_10007,attempt_1487556355205_0099_1_04_000031_0_10006,attempt_1487556355205_0099_1_04_000040_0_10006,attempt_1487556355205_0099_1_04_000030_0_10006&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-228.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000034_0_10011&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_02_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000043_0_10013,attempt_1487556355205_0099_1_04_000046_0_10022,attempt_1487556355205_0099_1_04_000049_0_10022,attempt_1487556355205_0099_1_04_000047_0_10013,attempt_1487556355205_0099_1_04_000048_0_10021&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000000_0_10010&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000049_0_10026&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000044_0_10026&keepAlive=true)
{noformat}




was (Author: rajesh.balamohan):

I enabled keep-alive in tez and mapreduce shuffle handler. (i.e 
tez.runtime.shuffle.keep-alive.enabled=true and 
mapreduce.shuffle.connection-keep-alive.enable=true)

To verify keep-alive connections are working without connection breaks, I 
enabled wire logging in HttpUrlConnection as follows

Created the following file {{http.logger.properties}} and placed in {{/tmp/}} 
folder in test cluster

{noformat}
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level = ALL
java.util.logging.ConsoleHandler.formatter = 
sun.net.www.protocol.http.HttpLogFormatter
sun.net.www.protocol.http.HttpURLConnection.level = ALL
sun.net.www.protocol.http.level=ALL
{noformat}

Tried out query-17 from TPC-DS with hive/tez which is relatively shuffle 
intensive in hive-1.  

{noformat}
hive --hiveconf 
tez.task.launch.cmd-opts="-Djava.util.logging.config.file=/tmp/http.logger.properties"
{noformat}

>From the logs, I see logs of {{KeepAlive stream used}} which does indicate 
>that keepAlive connections being used from sun's internal implementation. 
>Hadoop version used is 2.7.3 from HDP.

{noformat}
FINE: sun.net.www.MessageHeader@4f81a00c8 pairs: {GET 
/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
 HTTP/1.1: null}{UrlHash: rjk3o9nG2khcu5d+QGw1Gk1nhBY=}{name: 
mapreduce}{version: 1.0.0}{User-Agent: Java/1.8.0_121}{Host: 
ip-10-0-1-66.ec2.internal:13562}{Accept: text/html, image/gif, image/jpeg, *; 
q=.2, */*; q=.2}{Connection: keep-alive}
Feb 22, 2017 5:16:37 AM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
Feb 22, 2017 5:16:37 AM sun.net.www.protocol.http.HttpURLConnection 
getInputStream0
FINE: sun.net.www.MessageHeader@4146578c7 pairs: {null: HTTP/1.1 200 
OK}{ReplyHash: wox3/ak3+1T3PwXc8eCk9C4wtDs=}{name: mapreduce}{version: 
1.0.0}{Content-Length: 1156}{Connection: Keep-Alive}{Keep-Alive: timeout=5}
Feb 22, 2017 5:16:38 AM sun.net.www.protocol.http.HttpURLConnection 
plainConnect0
...
FINEST: ProxySelector Request for 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000002_0_10006,attempt_1487556355205_0099_1_06_000003_0_10007,attempt_1487556355205_0099_1_06_000004_0_10006,attempt_1487556355205_0099_1_06_000001_0_10005,attempt_1487556355205_0099_1_06_000000_0_10005,attempt_1487556355205_0099_1_06_000007_0_10003,attempt_1487556355205_0099_1_06_000005_0_10002,attempt_1487556355205_0099_1_06_000006_0_10002,attempt_1487556355205_0099_1_06_000027_0_10023,attempt_1487556355205_0099_1_06_000033_0_10023,attempt_1487556355205_0099_1_06_000039_0_10026,attempt_1487556355205_0099_1_06_000044_0_10026,attempt_1487556355205_0099_1_06_000049_0_10026,attempt_1487556355205_0099_1_06_000038_0_10017,attempt_1487556355205_0099_1_06_000046_0_10017,attempt_1487556355205_0099_1_06_000048_0_10026&keepAlive=true
...
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true)
Feb 22, 2017 5:16:38 AM sun.net.www.protocol.http.HttpURLConnection 
plainConnect0

........ printing more KeepAlive log lines alone from here onwards for 
reference..
...... 
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_00_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000003_0_10007,attempt_1487556355205_0099_1_06_000004_0_10006,attempt_1487556355205_0099_1_06_000001_0_10005,attempt_1487556355205_0099_1_06_000000_0_10005&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000007_0_10003,attempt_1487556355205_0099_1_06_000005_0_10002&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000001_0_10010&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000000_0_10010&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000002_0_10006&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000006_0_10002,attempt_1487556355205_0099_1_06_000027_0_10023,attempt_1487556355205_0099_1_06_000033_0_10023,attempt_1487556355205_0099_1_06_000039_0_10026&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_03_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000001_0_10010&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000044_0_10026&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-228.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000021_0_10007,attempt_1487556355205_0099_1_04_000034_0_10007,attempt_1487556355205_0099_1_04_000037_0_10006,attempt_1487556355205_0099_1_04_000027_0_10006,attempt_1487556355205_0099_1_04_000033_0_10006,attempt_1487556355205_0099_1_04_000036_0_10007,attempt_1487556355205_0099_1_04_000031_0_10006,attempt_1487556355205_0099_1_04_000040_0_10006,attempt_1487556355205_0099_1_04_000030_0_10006&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-228.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000034_0_10011&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_02_000000_0_10003&keepAlive=true
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000043_0_10013,attempt_1487556355205_0099_1_04_000046_0_10022,attempt_1487556355205_0099_1_04_000049_0_10022,attempt_1487556355205_0099_1_04_000047_0_10013,attempt_1487556355205_0099_1_04_000048_0_10021&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_04_000000_0_10010&keepAlive=true)
FINEST: KeepAlive stream used: 
http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000049_0_10026&keepAlive=true
FINEST: KeepAlive stream retrieved from the cache, 
sun.net.www.http.HttpClient(http://ip-10-0-1-66.ec2.internal:13562/mapOutput?job=job_1487556355205_0099&reduce=0&map=attempt_1487556355205_0099_1_06_000044_0_10026&keepAlive=true)
{noformat}



> Implement keep-alive timeout in tez shuffle handler
> ---------------------------------------------------
>
>                 Key: TEZ-3633
>                 URL: https://issues.apache.org/jira/browse/TEZ-3633
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3633.1.patch
>
>
> MAPREDUCE-5787 which added keep-alive to mapreduce shuffle handler was not 
> fully functional as despite advertising keep-alive option and adding the  
> header to the response, all connections were closed immediately after write. 
> This reduced the performance of certain fetches as now time is spent 
> requesting a second get to the same serve, only for that server to reset the 
> connection forcing the client to reestablish the connection on another port. 
> The details of this is hidden behind HttpURLConnection and doesn't show in 
> any log file at default logging level. However TCP sniffing does show errant 
> behavior.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to