[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008733#comment-17008733
 ] 

Adam Antal edited comment on YARN-10068 at 1/6/20 11:09 AM:
------------------------------------------------------------

Hi [~anand.srinivasan],

I have some comments on the patch.
1. {code:java}
        LOG.warn("Error closing the HTTP response's inputstream.", che);

{code}
Can we log this at ERROR level, since it is causing serious issues?
2. In the else branch, regardless of whether we succeed in this part:
{code:java}
                String stringType = resp.getEntity(String.class);
                msg = "Server response:\n" + stringType;
{code}
We will overwrite the msg {{String}} in the finally block. I suggest using 
{code:java}
msg +=
{code}
instead of a plain "=", so that the message keeps all the information.
3. Catching only the {{ClientHandlerException}} and 
{{UniformInterfaceException}} types is a bit concerning. If any other 
unchecked exception is thrown, it will leave no trace, because the 
{{YarnException}} we throw at the end of the finally block replaces it. I 
suggest adding a {{Throwable}} case, something like this:
{code:java}
   ...
} catch (ClientHandlerException | UniformInterfaceException chuie) {
   msg = "Error getting entity from the HTTP response: " + chuie.getLocalizedMessage();
} catch (Throwable t) {
   msg = "Error happened during getting server response: " + t.getLocalizedMessage();
} finally {
   ...
{code}
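
To illustrate point 3, here is a minimal, self-contained sketch of why an exception thrown from a finally block hides an uncaught exception from the try block, and how a catch-all preserves its message. It has no Hadoop or Jersey dependency: {{DemoException}} is a hypothetical stand-in for {{YarnException}}, {{IllegalStateException}} stands in for the two Jersey exception types, and the message strings are made up for the example rather than taken from the patch.

```java
import java.util.function.Supplier;

public class FinallyMaskingDemo {

    /** Hypothetical stand-in for YarnException, so the sketch needs no Hadoop classes. */
    static class DemoException extends Exception {
        DemoException(String message) {
            super(message);
        }
    }

    /** No catch-all: an unexpected exception from the try block is silently discarded. */
    static void readWithoutCatchAll(Supplier<String> entityReader) throws DemoException {
        String msg = "Unsuccessful response";
        try {
            msg += ", server response: " + entityReader.get();
        } catch (IllegalStateException ise) { // stand-in for the Jersey exception types
            msg += ", error getting entity: " + ise.getLocalizedMessage();
        } finally {
            // Throwing here replaces any exception still propagating from the try block.
            throw new DemoException(msg);
        }
    }

    /** With a catch-all: the unexpected exception's message is appended to msg. */
    static void readWithCatchAll(Supplier<String> entityReader) throws DemoException {
        String msg = "Unsuccessful response";
        try {
            msg += ", server response: " + entityReader.get();
        } catch (IllegalStateException ise) {
            msg += ", error getting entity: " + ise.getLocalizedMessage();
        } catch (Throwable t) {
            msg += ", unexpected error: " + t.getLocalizedMessage();
        } finally {
            throw new DemoException(msg);
        }
    }

    public static void main(String[] args) {
        Supplier<String> failingReader = () -> {
            throw new UnsupportedOperationException("boom");
        };
        try {
            readWithoutCatchAll(failingReader);
        } catch (DemoException e) {
            System.out.println("without catch-all: " + e.getMessage());
        }
        try {
            readWithCatchAll(failingReader);
        } catch (DemoException e) {
            System.out.println("with catch-all:    " + e.getMessage());
        }
    }
}
```

In the first method the "boom" message never reaches the caller; in the second it is part of the final message. Using {{+=}} throughout also keeps the earlier parts of the message, as suggested in point 2.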



> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---------------------------------------------------------------------------
>
>                 Key: YARN-10068
>                 URL: https://issues.apache.org/jira/browse/YARN-10068
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2
>    Affects Versions: 3.0.0
>         Environment: HDP version 3.1.4, Ambari version 2.7.4.0
>            Reporter: Anand Srinivasan
>            Assignee: Anand Srinivasan
>            Priority: Critical
>         Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough comparing v1 and v2 of the TimelineClient API revealed 
> that the v2 method TimelineV2ClientImpl#putObjects does not close 
> ClientResponse objects when the Timeline Server returns a success status. 
> The ClientResponse is closed, via ClientResponse#getEntity, only when the 
> server returns an error response.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both the 
> success and error cases, thereby avoiding this file descriptor leak.
> The customer's original symptom was that the NodeManager went down because 
> of a 'too many files open' condition, with many CLOSE_WAIT sockets observed 
> between the timeline client (on the NM) and the timeline server hosts. 
> Could you please help resolve this issue? Thanks.
>  
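
The leak described in the issue comes down to releasing the response on only one of two code paths. A minimal sketch of the general remedy, closing the response in a finally block so that both the success and the error path release it, might look as follows. {{FakeResponse}} is a hypothetical stand-in for Jersey's {{ClientResponse}} that only records whether it was closed, and {{handle}} is an illustrative method, not the actual {{putObjects}} code or the attached patch.

```java
import java.io.Closeable;

public class ResponseCloseDemo {

    /** Hypothetical stand-in for ClientResponse; records whether close() was called. */
    static class FakeResponse implements Closeable {
        final int status;
        boolean closed = false;

        FakeResponse(int status) {
            this.status = status;
        }

        String readEntity() {
            return "body for status " + status;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Handles a response and releases it on both the success and the error path. */
    static String handle(FakeResponse resp) {
        try {
            if (resp.status == 200) {
                return "OK";
            }
            return "Server response:\n" + resp.readEntity();
        } finally {
            // Runs on every path, so the underlying connection is always released.
            resp.close();
        }
    }

    public static void main(String[] args) {
        FakeResponse ok = new FakeResponse(200);
        FakeResponse error = new FakeResponse(500);
        System.out.println(handle(ok) + " / closed=" + ok.closed);
        System.out.println(handle(error) + " / closed=" + error.closed);
    }
}
```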


