[ https://issues.apache.org/jira/browse/IMPALA-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068623#comment-17068623 ]
ASF subversion and git services commented on IMPALA-4704:
---------------------------------------------------------

Commit 1d63348b933b266f63d76b06eecbdf636cb45770 in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1d63348 ]

IMPALA-9549: Handle catalogd startup delays when using local catalog

Impalads should be tolerant of delays in catalogd startup. Currently,
when running with the local catalog (use_local_catalog=true), impalad
startup can fail when catalogd startup is delayed.

What happens is that ImpalaServer's constructor calls
ImpalaServer::UpdateCatalogMetrics(), which maintains two metrics
counting the number of tables and databases. This is before the code in
ImpalaServer::Start() that waits for the catalogd to start (added by
IMPALA-4704), so there is no guarantee that catalogd is running. The
UpdateCatalogMetrics() call ends up calling getDbs() in the frontend
catalog. LocalCatalog::getDbs() tries to load the databases (and thus
contact catalogd), and this call will fail if catalogd is not running.
This fails startup. use_local_catalog=false is immune to this only
because it does not contact catalogd in Catalog::getDbs().

This moves the UpdateCatalogMetrics() call from the ImpalaServer
constructor to ImpalaServer::Start() after the impalad has already
waited for the catalogd to start up. It also limits the call to run
only in coordinators.

Prior to this change, when using local catalog, the executors would
have catalog.num-databases and catalog.num-tables set to the right
values at startup. These values would not be kept up to date. With this
change, the executors do not have these values set. Without local
catalog, both before and after this change, executors do not have
accurate counts for catalog.num-databases or catalog.num-tables.

Testing:
- Added a test to custom_cluster.test_catalog_wait to delay catalogd
  start up by 60 seconds and verify that the impalads successfully
  start up.
  This test fails prior to this change.
- Hand tested to verify that the metrics that are maintained by
  UpdateCatalogMetrics() are not meaningfully changed for coordinators
  and that executors do not have metrics set.

Change-Id: I1b5a94c59faaaa25927a169dcb58f310ce6b1044
Reviewed-on: http://gerrit.cloudera.org:8080/15561
Reviewed-by: Vihang Karajgaonkar <vih...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> ImpalaD should not open 21000 and 21050 Ports till Catalog is Received
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-4704
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4704
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog, Frontend
>   Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>            Reporter: Manish Maheshwari
>            Assignee: Vuk Ercegovac
>            Priority: Major
>              Labels: impala, ramp-up
>             Fix For: Impala 2.11.0
>
>
> Currently, impalads open their frontend connections before the catalog
> is received, and this results in query failures. The preferred behaviour
> would be for the ports to remain closed until the catalog is received
> and, if for any reason the StateStore (SS) connectivity is not
> established after reasonable attempts and timeouts, for the impalad to
> simply shut down.
> {code}
> impalad.INFO:I1216 17:39:40.437333 10463 jni-util.cc:166] com.cloudera.impala.common.AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
> impalad.INFO:I1216 17:39:40.438743 10463 status.cc:112] AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
> impalad.INFO:I1216 17:39:40.918184 10464 jni-util.cc:166] com.cloudera.impala.common.AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
> impalad.INFO:I1216 17:39:40.918994 10464 status.cc:112] AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
> impalad.INFO:I1216 17:39:44.129482 10465 jni-util.cc:166] com.cloudera.impala.common.AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
> {code}
> This will help especially when multiple impalads sit behind a load
> balancer (LB), so that connections are directed only to daemons that
> have the catalog while some servers or Impala services are being
> restarted for any reason.
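
The startup ordering discussed in this thread (wait for the catalog first, then touch catalog-dependent metrics and open client ports, and give up if the catalog never arrives) can be sketched roughly as follows. This is a minimal Python model under stated assumptions, not Impala's actual implementation: `Daemon`, `wait_for_catalog`, `CatalogNotReady`, and every parameter name are hypothetical stand-ins, and the real logic lives in the C++ ImpalaServer::Start().

```python
import time


class CatalogNotReady(Exception):
    """Raised when no catalog update arrives within the allowed time."""


def wait_for_catalog(is_catalog_ready, timeout_s, poll_interval_s=0.01):
    """Poll is_catalog_ready() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_catalog_ready():
            return
        if time.monotonic() >= deadline:
            raise CatalogNotReady("no catalog update within %.2fs" % timeout_s)
        time.sleep(poll_interval_s)


class Daemon:
    """Hypothetical stand-in for an impalad; not the real ImpalaServer."""

    def __init__(self, is_catalog_ready, is_coordinator=True):
        # Assumption for illustration: the constructor never touches the
        # catalog, so a slow catalog service cannot fail construction.
        self.is_catalog_ready = is_catalog_ready
        self.is_coordinator = is_coordinator
        self.metrics_updated = False
        self.ports_open = False

    def start(self, timeout_s):
        # 1. Block until the catalog is available; raises on timeout, in
        #    which case the caller is expected to shut the daemon down.
        wait_for_catalog(self.is_catalog_ready, timeout_s)
        # 2. Only coordinators refresh the catalog-derived metrics.
        if self.is_coordinator:
            self.metrics_updated = True
        # 3. Client-facing ports open last, once the catalog is known.
        self.ports_open = True
```

In this model a daemon whose catalog never becomes ready raises `CatalogNotReady` from `start()` with `ports_open` still `False`, so a load balancer would never route a query to it.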