László Végh created HIVE-26966:
----------------------------------
Summary: Hive is unable to delete Azure storage objects
Key: HIVE-26966
URL: https://issues.apache.org/jira/browse/HIVE-26966
Project: Hive
Issue Type: Improvement
Reporter: László Végh
When writing data to cloud storage, Hive uses the expected RAZ-authenticated
path (access via Managed Identity). HiveProtoEventsCleanerTask follows a
different approach: it tries to delete the data as the directory owner, a user
that may not be available in Ranger.
To solve this issue, either:
* investigate how authentication works for data writing and implement the same
mechanism for deletion (preferred solution), or
* introduce a new configuration value holding the name of the user that must
be used for deleting the data.
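The second option could look roughly like the sketch below. It only shows how
the cleaner might pick the user to act as: a configured name if one is set,
otherwise the directory owner as today. The property name
hive.proto.events.cleaner.user and the class CleanerUserResolver are
hypothetical and do not exist in Hive; a plain Map stands in for the Hive
configuration so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch, not Hive code: decides which user the cleaner should act as.
class CleanerUserResolver {
    // Hypothetical property name; not an existing Hive configuration key.
    static final String CLEANER_USER_KEY = "hive.proto.events.cleaner.user";

    // Returns the configured user if present and non-blank,
    // otherwise falls back to the directory owner (the current behaviour).
    static String resolveDeleteUser(Map<String, String> conf, String directoryOwner) {
        String configured = conf.get(CLEANER_USER_KEY);
        if (configured != null && !configured.trim().isEmpty()) {
            return configured.trim();
        }
        return directoryOwner;
    }

    public static void main(String[] args) {
        // Owner as it appears in the logs of this issue (an Azure object ID).
        String owner = "9ffea8fa-dec1-49ea-bb45-72bcb43951e8";

        Map<String, String> conf = new HashMap<>();
        System.out.println(resolveDeleteUser(conf, owner)); // no override set

        conf.put(CLEANER_USER_KEY, "hive");
        System.out.println(resolveDeleteUser(conf, owner)); // override set
    }
}
```

In the real task the resolved name would presumably be passed to something
like Hadoop's UserGroupInformation.createRemoteUser(name).doAs(...) around the
delete call; whether RAZ then authorizes that user depends on the Ranger
policies in place.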
Related Hadoop logs:
{code:java}
2022-12-07 11:30:07,163 WARN
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: [pool-310888-thread-7]:
unable to return groups for user 9ffea8fa-dec1-49ea-bb45-72bcb43951e8
org.apache.hadoop.security.ShellBasedUnixGroupsMapping$PartialGroupNameException:
The user name '9ffea8fa-dec1-49ea-bb45-72bcb43951e8' is not found. id:
9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
2022-12-07 11:30:07,164 ERROR
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore: [pool-310888-thread-7]:
Failed to get primary group for 9ffea8fa-dec1-49ea-bb45-72bcb43951e8, using
user name as primary group name
2022-12-07 11:30:07,231 INFO
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager:
[TezSessionPool-expiration]: Created new tez session for queue: default with
session id: df027903-43dd-46a8-b654-a25834f2b90d
{code}
Ranger logs:
{code:java}
2022-12-07 11:29:20,693 INFO
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
Token cancellation requested for identifier: (ABFS delegation owner=hive,
renewer=yarn, realUser=, issueDate=1670411627352, maxDate=1671016427352,
sequenceNumber=24065, masterKeyId=95)
2022-12-07 11:30:07,316 WARN
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: unable to return groups
for user 9ffea8fa-dec1-49ea-bb45-72bcb43951e8
PartialGroupNameException The user name '9ffea8fa-dec1-49ea-bb45-72bcb43951e8'
is not found. id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
2022-12-07 11:30:07,317 ERROR org.apache.ranger.raz.rest.AuthzREST:
AuthzREST.authorizeAccess()
org.apache.ranger.raz.intg.RangerRazException: not authorized to perform
delete-recursive on path
abfs://[email protected]/warehouse/tablespace/external/hive/sys.db/query_data/date=2021-08-20
at
org.apache.ranger.raz.processor.adls.AdlsGen2RazProcessor.generateDSASToken(AdlsGen2RazProcessor.java:216)
{code}