[
https://issues.apache.org/jira/browse/IMPALA-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Csaba Ringhofer updated IMPALA-11729:
-------------------------------------
Description:
impalad startup takes several seconds, including a few seconds before it even
tries to connect to statestored. From a test run (release mode) with a parallel
catalogd startup:
{code}
I1113 21:02:17.334743 4363 logging.cc:247] stdout will be logged to this file.
I1113 21:02:18.968991 4363 JniFrontend.java:141] Java Input arguments:
I1113 21:02:19.887519 4363 exec-env.cc:467] Starting statestore subscriber service
{code}
After connecting to the statestore, coordinators need to wait for the initial
catalog update, and processing it takes time that depends on the number of
catalog objects:
{code}
I1113 21:02:19.888423 4363 Frontend.java:1618] Waiting for local catalog to be initialized, attempt: 0
I1113 21:02:21.888621 4363 Frontend.java:1618] Waiting for local catalog to be initialized, attempt: 1
I1113 21:02:23.888849 4363 Frontend.java:1614] Local catalog initialized after: 4000 ms.
I1113 21:02:23.890105 4363 impala-server.cc:3103] Impala has started.
{code}
Meanwhile, catalogd takes about 2 seconds before it even tries to connect to HMS:
{code}
I1113 21:02:17.289606 4281 logging.cc:247] stdout will be logged to this file.
I1113 21:02:19.023339 4281 HiveMetaStoreClient.java:720] Trying to connect to metastore with URI (thrift://localhost:9083) in binary transport mode
I1113 21:02:21.671665 5028 catalog-server.cc:400] A catalog update with 1647 entries is assembled. Catalog version: 1649 Last sent catalog version: 0
{code}
While these ~6 seconds at impalad, including ~2 seconds of waiting for the
initial catalog update, are not very bad, making startup quicker would be
visible in test run times (custom cluster tests restart the cluster a lot) and
in autoscaling scenarios. Finding out what takes the time during startup would
also be a nice ramp-up task.
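As a starting point for finding where the time goes, the gaps between consecutive log lines can be computed mechanically. The following is a minimal sketch, not part of Impala: the regular expression assumes the glog-style prefix seen in the excerpts above (severity letter, MMDD, time with microseconds, thread id, file:line), and the function name `startup_gaps` is made up for illustration. The sample lines are the impalad lines quoted earlier.

```python
import re
from datetime import datetime

# Assumed glog-style prefix: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ ([^\]]+)\] (.*)$")


def startup_gaps(lines):
    """Return (seconds_since_previous_line, source_location, message) tuples."""
    gaps = []
    prev = None
    for line in lines:
        m = GLOG_RE.match(line)
        if m is None:
            continue  # skip wrapped continuation lines and non-log output
        # The log prefix has no year, so a placeholder year is prepended;
        # the gaps are only meaningful within a single run.
        ts = datetime.strptime("2000" + m.group(1) + " " + m.group(2),
                               "%Y%m%d %H:%M:%S.%f")
        if prev is not None:
            gaps.append(((ts - prev).total_seconds(), m.group(3), m.group(4)))
        prev = ts
    return gaps


SAMPLE = [
    "I1113 21:02:17.334743 4363 logging.cc:247] stdout will be logged to this file.",
    "I1113 21:02:18.968991 4363 JniFrontend.java:141] Java Input arguments:",
    "I1113 21:02:19.887519 4363 exec-env.cc:467] Starting statestore subscriber service",
]

# Print the largest gaps first, labelled with the line that ended each gap.
for seconds, location, message in sorted(startup_gaps(SAMPLE), reverse=True):
    print(f"{seconds:9.6f}s before {location}: {message}")
```

Sorting by gap size only points at the log line that ended a slow stretch; what ran during that stretch still has to be determined from the code or a profiler.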
The startup logic is single-threaded - I see the most potential in moving some
independent tasks to separate threads. It is also possible that we are doing
some completely unnecessary tasks in some scenarios (e.g. executor-only
impalad) or that some tasks could be safely moved to a later point when they
are actually needed.
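The idea of moving independent tasks to separate threads can be illustrated with a small, language-neutral sketch. It is written in Python purely for brevity; the real startup code is C++ and Java, and the three task functions below are hypothetical stand-ins that are not taken from impalad-main.cc. Which real steps are actually independent would have to be established first.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical, mutually independent startup steps (placeholders only).
def load_frontend():
    time.sleep(0.05)  # simulated work
    return "frontend ready"


def start_webserver():
    time.sleep(0.05)
    return "webserver ready"


def init_metrics():
    time.sleep(0.05)
    return "metrics ready"


def run_startup_tasks(tasks):
    """Run independent tasks concurrently and return {task name: result}."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        # result() re-raises any exception from a task, so a failing step
        # still aborts startup instead of being silently ignored.
        return {name: future.result() for name, future in futures.items()}


STARTUP_TASKS = {
    "frontend": load_frontend,
    "webserver": start_webserver,
    "metrics": init_metrics,
}

start = time.monotonic()
results = run_startup_tasks(STARTUP_TASKS)
elapsed = time.monotonic() - start
print(results, f"(took {elapsed:.2f}s)")
```

With this shape, the wall-clock time approaches that of the slowest task instead of the sum of all tasks, but only for steps that share no state and have no ordering requirement between them.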
Initialization is driven mainly from here:
https://github.com/apache/impala/blob/master/be/src/service/impalad-main.cc
https://github.com/apache/impala/blob/master/be/src/catalog/catalogd-main.cc
but most of the time is probably spent in Java code.
was:
impalad startup takes several seconds, even few seconds before trying
connecting to statestored. From a test run (release mode) with a parallel
catalogd startup:
{code}
I1113 21:02:17.334743 4363 logging.cc:247] stdout will be logged to this file.
I1113 21:02:18.968991 4363 JniFrontend.java:141] Java Input arguments:
I1113 21:02:19.887519 4363 exec-env.cc:467] Starting statestore subscriber
service
{code}
After connecting to statestore coordinators need to wait for the initial
catalog update and processing it will take time depending on the number of
catalog objects:
{code}
I1113 21:02:19.888423 4363 Frontend.java:1618] Waiting for local catalog to be
initialized, attempt: 0
I1113 21:02:21.888621 4363 Frontend.java:1618] Waiting for local catalog to be
initialized, attempt: 1
I1113 21:02:23.888849 4363 Frontend.java:1614] Local catalog initialized
after: 4000 ms.
I1113 21:02:23.890105 4363 impala-server.cc:3103] Impala has started.
{code}
Meanwhile on catalogd it takes 2 seconds before even trying to connect to HMS:
{code}
I1113 21:02:17.289606 4281 logging.cc:247] stdout will be logged to this file.
I1113 21:02:19.023339 4281 HiveMetaStoreClient.java:720] Trying to connect to
metastore with URI (thrift://localhost:9083) in binary transport mode
I1113 21:02:21.671665 5028 catalog-server.cc:400] A catalog update with 1647
entries is assembled. Catalog version: 1649 Last sent catalog version: 0
{code}
While this 6 secs at impalad with ~2 secs waiting for initial catalog update is
not very bad, making it quicker would be visible in test run times (custom
cluster tests restart the cluster a lot) and in autoscaling scenarios. Finding
out what takes the time during startup would be also nice ramp up task.
The startup logic is single threaded - I see the most potential in moving some
independent tasks to separate threads. It is also possible that we are doing
some completely unnecessary tasks in some scenarios (e..g executor only
impalad) or that some tasks could be safely moved to a later point when they
are actually needed,
> Investigate and improve impalad startup time
> --------------------------------------------
>
> Key: IMPALA-11729
> URL: https://issues.apache.org/jira/browse/IMPALA-11729
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Csaba Ringhofer
> Priority: Minor
> Labels: ramp-up