Joe McDonnell created IMPALA-9549:
-------------------------------------

             Summary: Impalad startup fails to wait for catalogd to start up 
when using local catalog
                 Key: IMPALA-9549
                 URL: https://issues.apache.org/jira/browse/IMPALA-9549
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.0
            Reporter: Joe McDonnell


Since Impala coordinators and executors may be starting up at the same time as 
the catalogd, they should be tolerant of delays in the catalogd starting up. 
When using local catalog (use_local_catalog=true), the Impalads fail with the 
following error if the catalogd startup is delayed:
{noformat}
I0323 14:22:03.151849 29565 jni-util.cc:288] 
org.apache.impala.catalog.local.LocalCatalogException: Unable to load database 
names
I0323 14:22:03.151849 29565 jni-util.cc:288] 
org.apache.impala.catalog.local.LocalCatalogException: Unable to load database 
names
 at org.apache.impala.catalog.local.LocalCatalog.loadDbs(LocalCatalog.java:94)
 at org.apache.impala.catalog.local.LocalCatalog.getDbs(LocalCatalog.java:83)
 at org.apache.impala.service.Frontend.getCatalogMetrics(Frontend.java:753)
 at 
org.apache.impala.service.JniFrontend.getCatalogMetrics(JniFrontend.java:220)
Caused by: org.apache.thrift.TException: 
org.apache.impala.common.InternalException: Couldn't open transport for 
localhost:26000 (connect() failed: Connection refused)

 at 
org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:382)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:174)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:583)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:578)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:509)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider.loadDbList(CatalogdMetaProvider.java:577)
 at org.apache.impala.catalog.local.LocalCatalog.loadDbs(LocalCatalog.java:92)
 ... 3 more
Caused by: org.apache.impala.common.InternalException: Couldn't open transport 
for localhost:26000 (connect() failed: Connection refused)
 at org.apache.impala.service.FeSupport.NativeGetPartialCatalogObject(Native 
Method)
 at 
org.apache.impala.service.FeSupport.GetPartialCatalogObject(FeSupport.java:440)
 at 
org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:380)
 ... 9 more
I0323 14:22:03.217051 29565 status.cc:126] LocalCatalogException: Unable to 
load database names
CAUSED BY: TException: org.apache.impala.common.InternalException: Couldn't 
open transport for localhost:26000 (connect() failed: Connection 
refused){noformat}
The ImpalaServer constructor calls ImpalaServer::UpdateCatalogMetrics() 
([https://github.com/apache/impala/blob/3b833902519fb8f0ef9b5fd20919c5fd85d22fcf/be/src/service/impala-server.cc#L452]).
UpdateCatalogMetrics() maintains two metrics that track the number of 
databases and the number of tables. Computing them ends up calling 
org.apache.impala.catalog.local.LocalCatalog.getDbs(), which calls loadDbs() 
([https://github.com/apache/impala/blob/ca0785ec206f27f06d8d6fd1b710779e548bbd8e/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java#L83]).
loadDbs() requires a connection to catalogd and fails if it cannot connect.

Importantly, this all happens before waiting for the catalogd to start up in 
the regular ImpalaServer::Start():
{code:cpp}
if (FLAGS_is_coordinator) exec_env_->frontend()->WaitForCatalog();
{code}
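
For illustration only, one way to tolerate a late catalogd is to retry the 
first metadata request until the connection succeeds. The sketch below is a 
minimal, hypothetical helper: CatalogStartupRetry and callWithRetry are not 
Impala names, and the attempt count and sleep interval are arbitrary 
parameters. It is not necessarily how this issue should be fixed.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of Impala): retries a call that may fail while catalogd starts. */
final class CatalogStartupRetry {
  private CatalogStartupRetry() {}

  /**
   * Invokes call up to maxAttempts times, sleeping sleepMs between failed attempts.
   * Returns the first successful result, or rethrows the last failure.
   */
  static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long sleepMs) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.get();
      } catch (RuntimeException e) {
        // For example "Connection refused" while catalogd is not yet listening.
        last = e;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMs);
        } catch (InterruptedException ie) {
          // Preserve the interrupt flag and stop retrying.
          Thread.currentThread().interrupt();
          throw new IllegalStateException("Interrupted while waiting for catalogd", ie);
        }
      }
    }
    throw last;
  }
}
```

A caller would wrap the request that needs catalogd, for example 
callWithRetry(() -> loadDbNames(), 30, 1000L), where loadDbNames() stands in 
for whatever call contacts catalogd.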
 

In the old catalog implementation (use_local_catalog=false), the getDbs() call 
on the catalog returns whatever values it has, and it does not try to contact 
the catalogd. This is why the regular case does not see this problem.
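
The difference between the two modes can be modeled roughly as follows. Both 
classes are simplified, hypothetical stand-ins and not the actual Impala 
catalog classes: CachedCatalogModel answers getDbs() from local state only, 
while OnDemandCatalogModel forwards getDbs() to a fetch function that 
represents the request to catalogd, so any connection failure surfaces to the 
caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model of use_local_catalog=false: getDbs() never contacts catalogd. */
final class CachedCatalogModel {
  private final List<String> cachedDbs = new ArrayList<>();

  /** Stands in for a catalog update arriving after catalogd is up. */
  void applyUpdate(List<String> dbNames) {
    cachedDbs.clear();
    cachedDbs.addAll(dbNames);
  }

  /** Returns whatever is cached, which is an empty list before the first update. */
  List<String> getDbs() {
    return new ArrayList<>(cachedDbs);
  }
}

/** Hypothetical model of use_local_catalog=true: getDbs() needs a reachable catalogd. */
final class OnDemandCatalogModel {
  private final Supplier<List<String>> fetchFromCatalogd;

  OnDemandCatalogModel(Supplier<List<String>> fetchFromCatalogd) {
    this.fetchFromCatalogd = fetchFromCatalogd;
  }

  /** Propagates any failure from the fetch, such as a refused connection. */
  List<String> getDbs() {
    return fetchFromCatalogd.get();
  }
}
```

In this model, calling getDbs() before catalogd is reachable yields an empty 
list in the cached case and an exception in the on-demand case, which matches 
the behavior described above.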


