[ https://issues.apache.org/jira/browse/HIVE-21206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sankar Hariappan updated HIVE-21206:
------------------------------------
    Status: Patch Available  (was: Open)

> Bootstrap replication is slow as it opens lot of metastore connections.
> -----------------------------------------------------------------------
>
> Key: HIVE-21206
> URL: https://issues.apache.org/jira/browse/HIVE-21206
> Project: Hive
> Issue Type: Bug
> Components: repl
> Affects Versions: 4.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: DR, pull-request-available, replication
> Attachments: HIVE-21206.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hive bootstrap replication of 1 TB of data, on-prem to on-prem, runs slower
> in Hive3 than in Hive2.
> Time taken for bootstrap replication of a table with 1000 partitions is as
> below:
> || Hive2 - Hive2 || Hive3 - Hive3 ||
> | Bootstrap: 7m | Bootstrap: 17m |
> Every MoveTask closes the metastore connection and opens a new one, which
> causes the slowdown.
> {code}
> 2019-02-08T12:28:30,174 INFO [HiveServer2-Background-Pool: Thread-1134]: ql.Driver (:()) - Starting task [Stage-5:MOVE] in serial mode
> 2019-02-08T12:28:30,177 INFO [HiveServer2-Background-Pool: Thread-1134]: exec.Task (:()) - Loading data to table nondefault.nondefault_table1 from hdfs://mycluster1/warehouse/tablespace/managed/hive/nondefault.db/nondefault_table1/.hive-staging_hive_2019-02-08_12-28-23_584_1482331698286040936-3/-ext-10001
> 2019-02-08T12:28:30,189 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Trying to connect to metastore with URI thrift://ctr-e139-1542663976389-62755-01-000014.hwx.site:9083
> 2019-02-08T12:28:30,189 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - HMSC::open(): Could not find delegation token. Creating KERBEROS-based thrift connection.
> 2019-02-08T12:28:30,206 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Opened a connection to metastore, current connections: 4
> 2019-02-08T12:28:30,206 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Connected to metastore.
> 2019-02-08T12:28:30,206 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.RetryingMetaStoreClient (:()) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive/ctr-e139-1542663976389-62755-01-000014.hwx.s...@hwqe.hortonworks.com (auth:KERBEROS) retries=24 delay=5 lifetime=0
> 2019-02-08T12:28:30,325 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1]: provider.BaseAuditHandler (:()) - Audit Status Log: name=hiveServer2.async.multi_dest.batch, finalDestination=hiveServer2.async.multi_dest.batch.solr, interval=01:00.002 minutes, events=2, succcessCount=1, totalEvents=56, totalSuccessCount=25
> 2019-02-08T12:28:30,520 INFO [HiveServer2-Background-Pool: Thread-1134]: common.FileUtils (FileUtils.java:mkdir(580)) - Creating directory if it doesn't exist: hdfs://mycluster1/warehouse/tablespace/managed/hive/nondefault.db/nondefault_table1/base_0000001
> 2019-02-08T12:28:31,245 INFO [HiveServer2-Background-Pool: Thread-1134]: ql.Driver (:()) - Starting task [Stage-11:MOVE] in serial mode
> 2019-02-08T12:28:31,245 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Closed a connection to metastore, current connections: 3
> 2019-02-08T12:28:31,246 INFO [HiveServer2-Background-Pool: Thread-1134]: exec.Task (:()) - Loading data to table nondefault.nondefault_table2 from hdfs://mycluster1/warehouse/tablespace/managed/hive/nondefault.db/nondefault_table2/.hive-staging_hive_2019-02-08_12-28-23_810_7457138692783022870-3/-ext-10002
> 2019-02-08T12:28:31,327 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Trying to connect to metastore with URI thrift://ctr-e139-1542663976389-62755-01-000014.hwx.site:9083
> 2019-02-08T12:28:31,327 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - HMSC::open(): Could not find delegation token. Creating KERBEROS-based thrift connection.
> 2019-02-08T12:28:31,336 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Opened a connection to metastore, current connections: 4
> 2019-02-08T12:28:31,337 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.HiveMetaStoreClient (:()) - Connected to metastore.
> 2019-02-08T12:28:31,337 INFO [HiveServer2-Background-Pool: Thread-1134]: metastore.RetryingMetaStoreClient (:()) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive/ctr-e139-1542663976389-62755-01-000014.hwx.s...@hwqe.hortonworks.com (auth:KERBEROS) retries=24 delay=5 lifetime=0
> 2019-02-08T12:28:31,642 INFO [HiveServer2-Background-Pool: Thread-1134]: common.FileUtils (FileUtils.java:mkdir(580)) - Creating directory if it doesn't exist: hdfs://mycluster1/warehouse/tablespace/managed/hive/nondefault.db/nondefault_table2/base_0000001
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
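
To check whether a HiveServer2 log shows the same pattern as the excerpt in the issue description, one rough approach is to count MOVE task starts against metastore connection open/close messages. The sketch below is not part of the patch; the function name `summarize` is hypothetical, and it assumes each log entry sits on a single line and uses the same message text as the excerpt.

```python
import re

# Message patterns copied from the log excerpt in the issue description.
# Assumption: other Hive versions may word these messages differently.
MOVE_RE = re.compile(r"Starting task \[Stage-\d+:MOVE\]")
OPEN_RE = re.compile(r"Opened a connection to metastore")
CLOSE_RE = re.compile(r"Closed a connection to metastore")


def summarize(lines):
    """Count MOVE task starts and metastore connection opens/closes.

    `lines` is any iterable of log lines, one log entry per line.
    """
    summary = {"move_tasks": 0, "opened": 0, "closed": 0}
    for line in lines:
        if MOVE_RE.search(line):
            summary["move_tasks"] += 1
        elif OPEN_RE.search(line):
            summary["opened"] += 1
        elif CLOSE_RE.search(line):
            summary["closed"] += 1
    return summary
```

Passing an open log file (for example `summarize(open(path))`, where `path` is a placeholder for the HiveServer2 log location) returns the three counts. If `opened` grows roughly in step with `move_tasks`, the log is consistent with each MoveTask reconnecting to the metastore.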