[jira] [Created] (HIVE-26509) Introduce dynamic leader election in HMS
Zhihua Deng created HIVE-26509:
----------------------------------

Summary: Introduce dynamic leader election in HMS
Key: HIVE-26509
URL: https://issues.apache.org/jira/browse/HIVE-26509
Project: Hive
Issue Type: New Feature
Components: Standalone Metastore
Reporter: Zhihua Deng

Since HIVE-21841, a leader HMS is selected by configuring metastore.housekeeping.leader.hostname on startup. This approach saves us from running duplicate HMS housekeeping tasks cluster-wide.

In this jira, we introduce an alternative, dynamic leader election that uses a Hive lock. Once an HMS owns the lock, it becomes the leader, carries out the housekeeping tasks, and sends heartbeats to renew the lock before it times out. If the leader fails to renew the lock, it stops any tasks it has already started. The election event is audited.

This gives us a more dynamic leader when the original one goes down, or in the public cloud where the property may not be well configured, and it reduces the leader's burden by spreading these tasks across different leaders.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
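For readers who want to see the shape of the lock-and-heartbeat scheme described above, here is a minimal sketch. It is not the HIVE-26509 implementation: LeaseLock, Candidate, and the injected clock are hypothetical stand-ins, and the lock lives in memory rather than in the metastore.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for the lock (hypothetical, not a Hive API). */
class LeaseLock {
    private String owner;    // current holder, or null if never taken
    private long expiresAt;  // time at which the current lease lapses

    /** Grants the lock if it is free, expired, or already held by the candidate. */
    synchronized boolean tryAcquire(String candidate, long now, long ttlMillis) {
        if (owner == null || now >= expiresAt || owner.equals(candidate)) {
            owner = candidate;
            expiresAt = now + ttlMillis;
            return true;
        }
        return false;
    }

    /** Heartbeat: extends the lease only if the caller still holds an unexpired lock. */
    synchronized boolean renew(String candidate, long now, long ttlMillis) {
        if (candidate.equals(owner) && now < expiresAt) {
            expiresAt = now + ttlMillis;
            return true;
        }
        return false;
    }
}

/** One HMS instance taking part in the election (hypothetical). */
class Candidate {
    final String name;
    final List<String> audit = new ArrayList<>();  // stands in for an audit log
    private final LeaseLock lock;
    private final long ttlMillis;
    private boolean leader;

    Candidate(String name, LeaseLock lock, long ttlMillis) {
        this.name = name;
        this.lock = lock;
        this.ttlMillis = ttlMillis;
    }

    boolean isLeader() {
        return leader;
    }

    /** One election or heartbeat round at the given (injected) time. */
    void tick(long now) {
        if (leader) {
            if (!lock.renew(name, now, ttlMillis)) {
                leader = false;
                audit.add(name + " lost leadership, stopping housekeeping tasks");
            }
        } else if (lock.tryAcquire(name, now, ttlMillis)) {
            leader = true;
            audit.add(name + " became leader, starting housekeeping tasks");
        }
    }
}

class LeaderElectionSketch {
    public static void main(String[] args) {
        LeaseLock lock = new LeaseLock();
        Candidate a = new Candidate("hms-a", lock, 1000);
        Candidate b = new Candidate("hms-b", lock, 1000);
        a.tick(0);
        b.tick(0);      // hms-a holds the lock, hms-b is refused
        a.tick(500);    // hms-a heartbeats, lease now runs to 1500
        b.tick(2000);   // hms-a missed its heartbeat, hms-b takes over
        a.tick(2100);   // hms-a notices it no longer owns the lock and steps down
        a.audit.forEach(System.out::println);
        b.audit.forEach(System.out::println);
    }
}
```

Passing the time in explicitly keeps the sketch deterministic. A real implementation would presumably drive tick() from a scheduled thread and would need to handle clock skew between HMS instances.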
Re: FYI: MetaStore running out of threads
>> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to create the UGI once to reduce the impact (suspecting this will have 50% impact).

I looked closely at the implementation of FileUtils.checkFileAccessWithImpersonation. It doesn't make two connections, so the 50% impact may not be relevant here.

On Thu, Sep 1, 2022 at 4:48 AM Rajesh Balamohan wrote:

> W.r.t. connection reuse issues, LLAP had a similar issue (not in HMS): https://issues.apache.org/jira/browse/HIVE-16020. It was making a connection in every task, and the UGI had to be persisted at the QueryInfo level to reduce the impact.
>
> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to create the UGI once to reduce the impact (suspecting this will have 50% impact).
>
> https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418
> https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461
>
> May have to explore whether a local cache with expiry in FileUtils can help reduce the impact further.
>
> ~Rajesh.B
>
> On Thu, Sep 1, 2022 at 1:24 AM Owen O'Malley wrote:
>
>> We're using HMS with Storage-Based Authorization and have been having trouble with the HMS running out of threads. Looking at the jstack & code, it appears that the problem is that RPC's ConnectionId is using UGI's equals/hash, which uses the Subject's Object equals/hash. Proxy user UGIs always create a new Subject and thus are always unique.
>>
>> This leads to the HMS creating too many threads. I've created a jira in Hadoop: https://issues.apache.org/jira/browse/HADOOP-18434
>>
>> Thanks,
>> Owen
Re: FYI: MetaStore running out of threads
W.r.t. connection reuse issues, LLAP had a similar issue (not in HMS): https://issues.apache.org/jira/browse/HIVE-16020. It was making a connection in every task, and the UGI had to be persisted at the QueryInfo level to reduce the impact.

In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to create the UGI once to reduce the impact (suspecting this will have 50% impact).

https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418
https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461

May have to explore whether a local cache with expiry in FileUtils can help reduce the impact further.

~Rajesh.B

On Thu, Sep 1, 2022 at 1:24 AM Owen O'Malley wrote:

> We're using HMS with Storage-Based Authorization and have been having trouble with the HMS running out of threads. Looking at the jstack & code, it appears that the problem is that RPC's ConnectionId is using UGI's equals/hash, which uses the Subject's Object equals/hash. Proxy user UGIs always create a new Subject and thus are always unique.
>
> This leads to the HMS creating too many threads. I've created a jira in Hadoop: https://issues.apache.org/jira/browse/HADOOP-18434
>
> Thanks,
> Owen
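To make the "local cache with expiry" idea above concrete, here is a rough sketch of what such a cache could look like. ExpiringCache and its methods are hypothetical names, not existing FileUtils code. The cached value is a plain String standing in for a UGI, and the time is passed in so the behaviour is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-key cache whose entries are reloaded after a fixed time-to-live. */
class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAt;

        Entry(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpiringCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached value, calling the loader only when the entry is missing or expired. */
    V get(K key, long now, Function<K, V> loader) {
        Entry<V> entry = entries.compute(key, (k, old) -> {
            if (old != null && now < old.expiresAt) {
                return old;  // still fresh, reuse it
            }
            return new Entry<V>(loader.apply(k), now + ttlMillis);
        });
        return entry.value;
    }
}

class ExpiringCacheSketch {
    public static void main(String[] args) {
        int[] loads = {0};  // counts how often the expensive object is created
        ExpiringCache<String, String> cache = new ExpiringCache<>(1000);
        Function<String, String> loader = user -> {
            loads[0]++;
            return "ugi-for-" + user;  // stand-in for creating a proxy-user UGI
        };
        cache.get("alice", 0, loader);     // miss, loads
        cache.get("alice", 500, loader);   // hit, reused
        cache.get("alice", 1500, loader);  // expired, loads again
        System.out.println("loads = " + loads[0]);  // prints "loads = 2"
    }
}
```

Whether reusing a UGI across calls is safe depends on how its credentials and filesystem handles are managed, which this sketch does not address.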
FYI: MetaStore running out of threads
We're using HMS with Storage-Based Authorization and have been having trouble with the HMS running out of threads. Looking at the jstack & code, it appears that the problem is that RPC's ConnectionId is using UGI's equals/hash, which uses the Subject's Object equals/hash. Proxy user UGIs always create a new Subject and thus are always unique.

This leads to the HMS creating too many threads. I've created a jira in Hadoop: https://issues.apache.org/jira/browse/HADOOP-18434

Thanks,
Owen
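The effect described above can be reproduced in miniature without any Hadoop classes. In the sketch below, IdentityUser and ValueUser are hypothetical stand-ins, not the real UGI or ConnectionId types: the first inherits Object's identity-based equals/hashCode, the second compares by name. A map keyed by the first never gets a hit when a fresh key object is built for every request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Stand-in for a key that inherits Object's identity-based equals/hashCode. */
final class IdentityUser {
    final String name;

    IdentityUser(String name) {
        this.name = name;
    }
}

/** Stand-in for a key that is compared by value. */
final class ValueUser {
    final String name;

    ValueUser(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ValueUser && ((ValueUser) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

class ConnectionCacheSketch {
    /** Number of cached "connections" after the same user is looked up with a fresh key each time. */
    static int connectionsWithIdentityKeys(int requests) {
        Map<IdentityUser, String> cache = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            cache.computeIfAbsent(new IdentityUser("alice"), u -> "connection");
        }
        return cache.size();
    }

    /** Same lookups, but the key is compared by value, so the entry is shared. */
    static int connectionsWithValueKeys(int requests) {
        Map<ValueUser, String> cache = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            cache.computeIfAbsent(new ValueUser("alice"), u -> "connection");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("identity keys: " + connectionsWithIdentityKeys(100));  // prints 100
        System.out.println("value keys:    " + connectionsWithValueKeys(100));     // prints 1
    }
}
```

If each cached connection also owns a thread, the first variant grows without bound under load, which matches the symptom reported here.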
[jira] [Created] (HIVE-26508) Remove netty transitive dependencies from hcatalog and hbase pom files to avoid CVEs
Sai Hemanth Gantasala created HIVE-26508:
--------------------------------------------

Summary: Remove netty transitive dependencies from hcatalog and hbase pom files to avoid CVEs
Key: HIVE-26508
URL: https://issues.apache.org/jira/browse/HIVE-26508
Project: Hive
Issue Type: Bug
Components: HBase Handler, HCatalog
Affects Versions: 4.0.0-alpha-1, 4.0.0, 4.0.0-alpha-2
Reporter: Sai Hemanth Gantasala
Assignee: Sai Hemanth Gantasala

Remove the netty transitive dependencies (coming from Hadoop-related dependencies) from the hcatalog and hbase pom files to avoid CVEs.
[jira] [Created] (HIVE-26507) Iceberg: In place metadata generation may not work for certain datatypes
Rajesh Balamohan created HIVE-26507:
---------------------------------------

Summary: Iceberg: In place metadata generation may not work for certain datatypes
Key: HIVE-26507
URL: https://issues.apache.org/jira/browse/HIVE-26507
Project: Hive
Issue Type: Bug
Reporter: Rajesh Balamohan

"alter table" statements can be used to generate Iceberg metadata (i.e. to convert external tables to Iceberg tables). As part of this process, certain datatypes are also converted to Iceberg-compatible types (e.g. char -> string). "iceberg.mr.schema.auto.conversion" enables this conversion.

This can cause issues at runtime. Here is an example:

{noformat}
Before conversion:
==================
-- external table
select count(*) from customer_demographics where cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = '2 yr Degree';
27440

After conversion:
=================
-- iceberg table
select count(*) from customer_demographics where cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = '2 yr Degree';
0

select count(*) from customer_demographics where cd_gender = 'F' and cd_marital_status = 'U' and trim(cd_education_status) = '2 yr Degree';
27440
{noformat}
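The trim() workaround in the example suggests that the stored values carry trailing padding from the original char column, which a plain string comparison no longer ignores. The sketch below illustrates that reading only. The column width of 20 and the helper padToWidth are assumptions made for the illustration, not details taken from the report.

```java
class CharPaddingSketch {
    /** Right-pads a value with spaces to a fixed width, the way a char(n) value is stored. */
    static String padToWidth(String value, int width) {
        return String.format("%-" + width + "s", value);
    }

    public static void main(String[] args) {
        String literal = "2 yr Degree";
        String stored = padToWidth(literal, 20);  // assumed char(20) column

        // Plain string comparison sees the trailing spaces and does not match.
        System.out.println(stored.equals(literal));         // prints false

        // Trimming first restores the match, like trim(cd_education_status) in the query.
        System.out.println(stored.trim().equals(literal));  // prints true
    }
}
```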