[ 
https://issues.apache.org/jira/browse/IMPALA-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LiPenglin updated IMPALA-11194:
-------------------------------
    Description: 
When I run LOAD DATA from an HDFS cluster that is configured with Ranger, 
the following exception occurs:
{code:java}
// sql
LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl 
PARTITION(status='origin');

// impalad exception
org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
        at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
        at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
        at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
        at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
        at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
        at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
        at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
        at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)

{code}
According to org.apache.hadoop.fs.FileStatus#permission, the impalad process 
user is not the owner of the HDFS file and does not have permission to read it.

 
{code:java}
[hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
-rw-------   1 hdfs hdfs        270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
{code}
 

{color:#FF0000}*But*{color} I have already authorized the impalad process user 
in Ranger, so that user actually has read and write permissions on the file.

 
{code:java}
[hdfs@hybrid02 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_7007
Default principal: impala/hybrid02@SENSORSDATA

[hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/ 
-rw-------   1 hdfs hdfs        270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3

[hdfs@hybrid02 ~]$ hdfs dfs -get /tmp/user_tag_p19_data_3_test 
[hdfs@hybrid02 ~]$ ll -f user_tag_p19_data_3_test 
user_tag_p19_data_3_test
{code}
This happens because the permission check on files in 
`org.apache.impala.analysis.LoadDataStmt#analyzePaths` is based mainly on 
org.apache.hadoop.fs.FileStatus#permission, which reflects only the HDFS 
permission bits and not the access granted through Ranger. That is why the 
exception is thrown.
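
As a rough illustration of the mismatch described above, the following self-contained Java sketch models the two views of the same file. It does not use any Hadoop, Ranger, or Impala classes: FileMeta, Grant, modeBitsAllowRead, and effectiveAllowRead are hypothetical names introduced only for this example, and the "/user_tag/" grant prefix is an assumed policy, not one taken from the cluster. The fallback to mode bits when no grant matches is also a simplification. In a real deployment the effective decision is made on the NameNode side; Hadoop's FileSystem#access(Path, FsAction) asks the NameNode directly and may be one direction worth exploring.

```java
import java.util.Collections;
import java.util.Set;

public class LoadDataPermissionSketch {

    /** Hypothetical stand-in for the owner/group/mode triple reported for a file. */
    public static final class FileMeta {
        final String owner;
        final String group;
        final String mode; // nine characters, e.g. "rw-------"

        public FileMeta(String owner, String group, String mode) {
            this.owner = owner;
            this.group = group;
            this.mode = mode;
        }
    }

    /** Hypothetical stand-in for an external policy granting READ under a path prefix. */
    public static final class Grant {
        final String user;
        final String pathPrefix;

        public Grant(String user, String pathPrefix) {
            this.user = user;
            this.pathPrefix = pathPrefix;
        }
    }

    /** Client-side style check: consults only the owner, group, and mode bits. */
    public static boolean modeBitsAllowRead(FileMeta f, String user, Set<String> userGroups) {
        if (user.equals(f.owner)) {
            return f.mode.charAt(0) == 'r'; // owner read bit
        }
        if (userGroups.contains(f.group)) {
            return f.mode.charAt(3) == 'r'; // group read bit
        }
        return f.mode.charAt(6) == 'r';     // other read bit
    }

    /** Effective check: a matching external grant allows the read, else fall back to mode bits. */
    public static boolean effectiveAllowRead(FileMeta f, String path, String user,
                                             Set<String> userGroups, Set<Grant> grants) {
        for (Grant g : grants) {
            if (g.user.equals(user) && path.startsWith(g.pathPrefix)) {
                return true;
            }
        }
        return modeBitsAllowRead(f, user, userGroups);
    }

    public static void main(String[] args) {
        // Mirrors the listing in this issue: -rw------- owned by hdfs:hdfs.
        FileMeta file = new FileMeta("hdfs", "hdfs", "rw-------");
        String path = "/user_tag/import_staging/user_tag/user_tag_p19_data_3";
        Set<String> impalaGroups = Collections.singleton("impala");
        // Assumed policy for illustration only.
        Set<Grant> grants = Collections.singleton(new Grant("impala", "/user_tag/"));

        System.out.println("mode bits allow read: "
            + modeBitsAllowRead(file, "impala", impalaGroups));
        System.out.println("effective allow read: "
            + effectiveAllowRead(file, path, "impala", impalaGroups, grants));
    }
}
```

With these inputs the mode-bit check denies the read while the policy-aware check allows it, which is the same disagreement reported here between the analysis-time check and the actual `hdfs dfs -get` result.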

  was:
When I run LOAD DATA from an HDFS cluster that is configured with Ranger, 
the following exception occurs:
{code:java}
// sql
LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl 
PARTITION(status='origin');

// impalad exception
org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
        at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
        at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
        at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
        at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
        at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
        at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
        at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
        at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)

{code}
According to org.apache.hadoop.fs.FileStatus#permission, the impalad process 
user is not the owner of the HDFS file and does not have permission to read it. 
However, HDFS is configured with Ranger, and after I authorized the impalad 
process user in Ranger, that user actually had read and write permissions. 
That mismatch is why the exception occurs.


> Unable to LOAD DATA from HDFS configured with ranger
> ----------------------------------------------------
>
>                 Key: IMPALA-11194
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11194
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: LiPenglin
>            Priority: Major


