Anthony Young-Garner created SENTRY-2556:
--------------------------------------------
Summary: Provide prefer local option to improve performance when
Hive on S3 is used in conjunction with Sentry HA and Sentry-HDFS sync
Key: SENTRY-2556
URL: https://issues.apache.org/jira/browse/SENTRY-2556
Project: Sentry
Issue Type: Improvement
Components: Sentry
Affects Versions: 2.0.0
Reporter: Anthony Young-Garner
Performance degrades when Hive on S3 is used in conjunction with Sentry HA and
Sentry-HDFS sync and 1) the Hive Metastore Server (HMS) is connected (via
Sentry client) to a remote Sentry Server while 2) the HiveServer2 is connected
(via Sentry client) to a local Sentry Server.
TO REPRODUCE:
# Set up Sentry HA with HDFS sync
# Configure Hive and HDFS to use S3
# Create an external table in S3
EXAMPLE: CREATE EXTERNAL TABLE mytesttable (firstname STRING, lastname STRING,
address STRING, city STRING, state STRING, zip int) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' LOCATION 's3a://ajy-sentry/';
RESULT: Creating a table in S3 can take a very long time (two orders of
magnitude slower than table creation in HDFS). Note that the slowdown won't
always occur (see below for more detail).
To force a test system into the condition that causes the performance
degradation:
# For each HiveServer2 instance, set the
sentry.service.client.server.rpc-addresses property to one value (local to the
HiveServer2 instance) and then restart that HiveServer2 instance
# For each HMS instance, set the
sentry.service.client.server.rpc-addresses property to one value (remote to the
HMS instance) and then restart that HMS instance
-------------
I think the needed code change would be to provide a _prefer local_ option on
the SentryTransportPool and/or the SentryGenericServiceClientDefaultImpl so
that, when the HMS is on the same node as one of the Sentry servers, the
local Sentry server is used. Testing would need to be performed to determine
whether this should become the default behavior or should be user-configurable
for specific situations.
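
To make the _prefer local_ idea concrete, here is a minimal sketch of the
endpoint ordering such an option might perform. It is not taken from the
Sentry code base: the class PreferLocalEndpoints, its methods, and the host
names are hypothetical, and it assumes the value of
sentry.service.client.server.rpc-addresses is a comma-separated list of
host[:port] entries. How SentryTransportPool would actually consume the
reordered list is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Sentry: reorders configured Sentry
// endpoints so that servers running on the local node are tried first.
final class PreferLocalEndpoints {

    // Returns the host part of a "host:port" entry; a bare host is returned as-is.
    // Assumption: entries are host names or IPv4 addresses (no bracketed IPv6).
    static String hostOf(String endpoint) {
        int colon = endpoint.lastIndexOf(':');
        return colon < 0 ? endpoint : endpoint.substring(0, colon);
    }

    // Splits the comma-separated address list and moves every endpoint whose
    // host is in localHosts (expected in lower case) to the front. The relative
    // order within the local group and within the remote group is preserved.
    static List<String> preferLocal(String rpcAddresses, Set<String> localHosts) {
        List<String> local = new ArrayList<>();
        List<String> remote = new ArrayList<>();
        for (String raw : rpcAddresses.split(",")) {
            String endpoint = raw.trim();
            if (endpoint.isEmpty()) {
                continue; // tolerate stray commas
            }
            String host = hostOf(endpoint).toLowerCase(Locale.ROOT);
            if (localHosts.contains(host)) {
                local.add(endpoint);
            } else {
                remote.add(endpoint);
            }
        }
        local.addAll(remote);
        return local;
    }

    public static void main(String[] args) {
        // Hypothetical two-node Sentry HA setup; this process runs on sentry2.
        Set<String> localHosts = new HashSet<>(Arrays.asList("sentry2.example.com"));
        String configured = "sentry1.example.com:8038,sentry2.example.com:8038";
        System.out.println(preferLocal(configured, localHosts));
        // prints [sentry2.example.com:8038, sentry1.example.com:8038]
    }
}
```

In a real client the set of local host names would presumably be derived from
the node's own addresses (for example via java.net.InetAddress), and the
remote endpoints would stay in the list as failover targets, so HA behavior
is retained when the local Sentry server is down.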