[ 
https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357636#comment-15357636
 ] 

Sreenath Somarajapuram commented on TEZ-3318:
---------------------------------------------

[~hitesh]
When polling fails, we don't retry the poll (from the AM). Instead, we reload 
the page after double the polling delay. That is, with a polling delay of 3 
seconds, if the RM is not reachable we reload the page (from ATS) every 6 
seconds until either 1. the RM is reachable, or 2. the application is complete.
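
To make the behavior above concrete, here is a rough sketch of the decision 
logic. It is not the actual Tez UI code: the names next_step, NextStep, and 
POLL_INTERVAL_SEC are made up for illustration, and the only values taken from 
this comment are the 3-second polling delay and the doubled reload delay.

```python
from dataclasses import dataclass

# Polling delay mentioned in the comment; the constant name is hypothetical.
POLL_INTERVAL_SEC = 3


@dataclass
class NextStep:
    action: str     # "poll_am", "reload_from_ats", or "stop"
    delay_sec: int  # seconds to wait before taking the action


def next_step(poll_succeeded: bool, app_complete: bool,
              poll_interval_sec: int = POLL_INTERVAL_SEC) -> NextStep:
    """Decide what the UI does after one polling attempt (illustrative only)."""
    if app_complete:
        # Exit condition 2: the application is complete, nothing left to poll.
        return NextStep("stop", 0)
    if poll_succeeded:
        # RM is reachable: keep polling the AM at the normal interval.
        return NextStep("poll_am", poll_interval_sec)
    # Poll failed (RM not reachable): no poll retry, reload the page
    # from ATS after double the polling delay.
    return NextStep("reload_from_ats", 2 * poll_interval_sec)
```

In this sketch a failed poll never schedules another poll directly. The reload 
repeats every 6 seconds for as long as the RM stays down, which is why a 
separate retry limit may not be needed.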

Considering that, do we need this retry limit? Adding the limit is a small 
change, though.


> Tez UI: Polling is not restarted after RM recovery
> --------------------------------------------------
>
>                 Key: TEZ-3318
>                 URL: https://issues.apache.org/jira/browse/TEZ-3318
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Sreenath Somarajapuram
>            Assignee: Sreenath Somarajapuram
>         Attachments: TEZ-3318.1.patch
>
>
> For a running DAG, we poll the AM to get progress and other real-time 
> information. This communication happens via the RM. If the RM goes down, 
> polling is not re-established even after its recovery.
> Steps to reproduce:
> 1. Run a job
> 2. Go to DAG details page, and ensure that the progress is getting updated.
> 3. Stop RM, and ensure that error bar is getting displayed in the UI.
> 4. Start RM.
> 5. As soon as the RM is online, the progress bar should get updated again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
