[ 
https://issues.apache.org/jira/browse/KYLIN-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Arki updated KYLIN-4500:
------------------------------
    Description: 
h4. Environment
 * Kylin server 3.0.0
 * EMR 5.28

h4. Issue

After an extended uptime, both the Kylin query server and jobs running on EMR 
stop working. The root cause in both cases is:
{noformat}
Caused by: java.io.IOException: 
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to 
execute HTTP request: Timeout waiting for connection from pool
        at 
com.amazon.ws.emr.hadoop.fs.s3n2.S3NativeFileSystem2.getFileStatus(S3NativeFileSystem2.java:257)
 ~[emrfs-hadoop-assembly-2.37.0.jar:?]{noformat}
Increasing the *fs.s3.maxConnections* setting to 10000, as suggested in 
[https://aws.amazon.com/premiumsupport/knowledge-center/emr-timeout-connection-wait/], 
only delays the issue, so the underlying problem is likely a connection leak. 
The fact that restarting the Kylin service resolves the problem also points to 
a leak.
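
To illustrate why a larger pool only postpones the failure, here is a minimal toy simulation. It is a sketch under stated assumptions, not the EMRFS or AWS SDK implementation: the class `LeakyPool`, the function `requests_until_exhausted`, and the leak rate of one unreleased connection per ten requests are all hypothetical and chosen only for illustration.

```python
import threading


class LeakyPool:
    """Toy fixed-size connection pool (hypothetical, not the EMRFS pool)."""

    def __init__(self, max_connections):
        # Each permit of the semaphore stands for one pooled connection.
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self, timeout=0.01):
        # Mimic the reported error when no connection frees up in time.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("Timeout waiting for connection from pool")

    def release(self):
        self._slots.release()


def requests_until_exhausted(max_connections, leak_every):
    """Count requests served before the pool times out, assuming every
    `leak_every`-th request never returns its connection."""
    pool = LeakyPool(max_connections)
    served = 0
    while True:
        try:
            pool.acquire()
        except TimeoutError:
            return served
        served += 1
        if served % leak_every != 0:
            pool.release()  # well-behaved caller returns the connection
        # otherwise the connection is leaked and stays checked out


if __name__ == "__main__":
    # Assumed leak rate: one leaked connection per 10 requests.
    for size in (50, 500):
        print(size, requests_until_exhausted(size, leak_every=10))
```

In this model the number of requests served before the timeout grows in proportion to the pool size, but the pool is always exhausted eventually. Only recreating the pool (the analogue of restarting the service) or fixing the caller that fails to release connections removes the failure.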

A full stack trace from the QueryService is attached.

 

  was:
h4. Environment
 * Kylin server 3.0.0
 * EMR 5.28

h4. Issue

After an extended uptime, both the Kylin query server and jobs running on EMR 
stop working. The root cause in both cases is:

{{Caused by: java.io.IOException: 
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to 
execute HTTP request: Timeout waiting for connection from pool}}
{{ at 
com.amazon.ws.emr.hadoop.fs.s3n2.S3NativeFileSystem2.getFileStatus(S3NativeFileSystem2.java:257)
 ~[emrfs-hadoop-assembly-2.37.0.jar:?]}}

Increasing the *fs.s3.maxConnections* setting to 10000, as suggested in 
[https://aws.amazon.com/premiumsupport/knowledge-center/emr-timeout-connection-wait/], 
only delays the issue, so the underlying problem is likely a connection leak. 
The fact that restarting the Kylin service resolves the problem also points to 
a leak.

A full stack trace from the QueryService is attached.

 


> Timeout waiting for connection from pool
> ----------------------------------------
>
>                 Key: KYLIN-4500
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4500
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: Gabor Arki
>            Priority: Major
>         Attachments: kylin-connection-timeout.txt
>
>
> h4. Environment
>  * Kylin server 3.0.0
>  * EMR 5.28
> h4. Issue
> After an extended uptime, both the Kylin query server and jobs running on 
> EMR stop working. The root cause in both cases is:
> {noformat}
> Caused by: java.io.IOException: 
> com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable 
> to execute HTTP request: Timeout waiting for connection from pool
>         at 
> com.amazon.ws.emr.hadoop.fs.s3n2.S3NativeFileSystem2.getFileStatus(S3NativeFileSystem2.java:257)
>  ~[emrfs-hadoop-assembly-2.37.0.jar:?]{noformat}
> Increasing the *fs.s3.maxConnections* setting to 10000, as suggested in 
> [https://aws.amazon.com/premiumsupport/knowledge-center/emr-timeout-connection-wait/], 
> only delays the issue, so the underlying problem is likely a connection 
> leak. The fact that restarting the Kylin service resolves the problem also 
> points to a leak.
> A full stack trace from the QueryService is attached.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
