[ 
https://issues.apache.org/jira/browse/IMPALA-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508773#comment-17508773
 ] 

Quanlong Huang commented on IMPALA-11194:
-----------------------------------------

Hi [~LiPenglin], thanks for reporting this! However, I think this duplicates 
IMPALA-10272.

Do you want to work on IMPALA-10272 instead?

CC [~fangyurao] 

> Unable to LOAD DATA from HDFS configured with ranger
> ----------------------------------------------------
>
>                 Key: IMPALA-11194
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11194
>             Project: IMPALA
>          Issue Type: Improvement
>    Affects Versions: Impala 4.0.0
>            Reporter: LiPenglin
>            Priority: Major
>
> Currently, when I LOAD DATA from an HDFS cluster configured with Ranger, 
> the following exception occurs:
> {code:java}
> // sql
> LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl 
> PARTITION(status='origin');
> // impalad exception
> org.apache.impala.common.AnalysisException: Unable to LOAD DATA from 
> hdfs://... because Impala does not have READ permissions on this file
>         at 
> org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
>         at 
> org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
>         at 
> org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
>         at 
> org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
>         at 
> org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
>         at 
> org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
>         at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
>         at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164) 
> {code}
> According to `org.apache.hadoop.fs.FileStatus#permission`, the impalad 
> process user is not the owner of the HDFS file and does not have permission 
> to read it.
> {code:java}
> [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
> -rw-------   1 hdfs hdfs        270 2022-03-17 20:13 
> /user_tag/import_staging/user_tag/user_tag_p19_data_3
> {code}
> {color:#ff0000}*But* {color} I have already authorized the impalad process 
> user in Ranger, so the impalad process user actually has read and write 
> permissions.
> {code:java}
> [hdfs@hybrid02 ~]$ klist
> Ticket cache: FILE:/tmp/krb5cc_7007
> Default principal: impala/hybrid02@SENSORSDATA
> [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/ 
> -rw-------   1 hdfs hdfs        270 2022-03-17 20:13 
> /user_tag/import_staging/user_tag/user_tag_p19_data_3
> [hdfs@hybrid02 ~]$ hdfs dfs -get /tmp/user_tag_p19_data_3_test 
> [hdfs@hybrid02 ~]$ ll -f user_tag_p19_data_3_test 
> user_tag_p19_data_3_test
> {code}
> In my opinion, this is because the permission check on files in 
> `org.apache.impala.analysis.LoadDataStmt#analyzePaths` is mainly based on 
> `org.apache.hadoop.fs.FileStatus#permission`, which is why this exception 
> occurs.
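
To illustrate the point in the quoted description, here is a minimal sketch of a read check that looks only at owner, group, and mode bits, which is roughly what a check based on FileStatus#permission amounts to. It is not Impala's actual code: the class ModeBitCheck and the method canRead are hypothetical, it uses java.nio's POSIX permission types as a stand-in for Hadoop's FsPermission so that it runs without Hadoop on the classpath, and the short user name "impala" and its group membership are assumptions. A grant made only in Ranger never shows up in these bits, so a check like this denies access even though the actual read succeeds.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Illustrative only (hypothetical class): a read check based purely on owner, group, and mode bits. */
public class ModeBitCheck {

    /**
     * Returns true if the mode bits alone grant read access to the user.
     * mode is the nine-character permission string from "hdfs dfs -ls"
     * without the leading file-type character, e.g. "rw-------".
     */
    public static boolean canRead(String mode, String owner, String group,
                                  String user, Set<String> userGroups) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        if (user.equals(owner)) {
            // The owner class applies; group and other bits are not consulted.
            return perms.contains(PosixFilePermission.OWNER_READ);
        }
        if (userGroups.contains(group)) {
            return perms.contains(PosixFilePermission.GROUP_READ);
        }
        return perms.contains(PosixFilePermission.OTHERS_READ);
    }

    public static void main(String[] args) {
        // Listing from the issue: -rw------- hdfs hdfs.
        // Assumption: the impalad user is "impala" and belongs only to group "impala".
        boolean allowed = canRead("rw-------", "hdfs", "hdfs", "impala", Set.of("impala"));
        System.out.println("mode bits grant read to impala: " + allowed);
    }
}
```

With the values from the listing, this prints "mode bits grant read to impala: false", which matches the AnalysisException even though the file can be fetched with the impala principal.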



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
