[ https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357636#comment-15357636 ]
Sreenath Somarajapuram commented on TEZ-3318:
---------------------------------------------

[~hitesh]
When polling fails, we don't retry the poll (from AM). Instead, we reload the page after double the polling interval. That is, the polling delay is 3 seconds, and if RM is not reachable we reload the page (from ATS) every 6 seconds until either:
1. RM is reachable, or
2. the application is complete.
Considering that, do we need this retry limit? Adding the limit is a small change, though.

> Tez UI: Polling is not restarted after RM recovery
> --------------------------------------------------
>
>                 Key: TEZ-3318
>                 URL: https://issues.apache.org/jira/browse/TEZ-3318
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Sreenath Somarajapuram
>            Assignee: Sreenath Somarajapuram
>         Attachments: TEZ-3318.1.patch
>
>
> For a running DAG, we poll the AM to get progress and other realtime
> information. This communication happens via RM. If RM goes down, the
> polling is not re-established even after RM recovers.
> Steps to reproduce:
> 1. Run a job.
> 2. Go to the DAG details page, and ensure that the progress is getting updated.
> 3. Stop RM, and ensure that the error bar is displayed in the UI.
> 4. Start RM.
> 5. As soon as RM is online, the progress bar must get updated.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
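
The fallback described in the comment above (poll the AM every 3 seconds, reload from ATS at double that interval while RM is unreachable, stop once the application completes) can be sketched as a small decision function. This is only an illustration under stated assumptions, not the Tez UI code: the names `Action`, `next_action`, `POLL_INTERVAL_SEC` and `RELOAD_INTERVAL_SEC` are hypothetical, and only the 3 second and 6 second values come from the comment.

```python
from dataclasses import dataclass

# Intervals taken from the comment; the constant names are hypothetical.
POLL_INTERVAL_SEC = 3
RELOAD_INTERVAL_SEC = 2 * POLL_INTERVAL_SEC  # "double the time", i.e. 6 seconds


@dataclass(frozen=True)
class Action:
    """What the UI should do next (hypothetical type for this sketch)."""
    kind: str       # "poll_am", "reload_from_ats", or "stop"
    delay_sec: int  # seconds to wait before performing the action


def next_action(rm_reachable: bool, app_complete: bool) -> Action:
    """Pick the next step from the two conditions named in the comment."""
    if app_complete:
        # Condition 2: the application is complete, so nothing is left to poll.
        return Action("stop", 0)
    if rm_reachable:
        # Condition 1: RM is reachable, so normal AM polling (via RM) resumes.
        return Action("poll_am", POLL_INTERVAL_SEC)
    # RM is unreachable: fall back to reloading the page data from ATS.
    return Action("reload_from_ats", RELOAD_INTERVAL_SEC)
```

The sketch deliberately has no retry counter: the reload repeats until one of the two exit conditions holds, which is the behaviour the comment weighs against adding a retry limit.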