[
https://issues.apache.org/jira/browse/IMPALA-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LiPenglin updated IMPALA-11194:
-------------------------------
Description:
Currently, when I run LOAD DATA from an HDFS path on a cluster configured with
Ranger, the following exception occurs:
{code:java}
// sql
LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl
PARTITION(status='origin');
// impalad exception
org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
    at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
    at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
    at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
    at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
    at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
    at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
    at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
    at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
{code}
According to `org.apache.hadoop.fs.FileStatus#permission`, the impalad process
user is not the owner of the HDFS file and does not have permission to read it.
{code:java}
[hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
-rw------- 1 hdfs hdfs 270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
{code}
{color:#FF0000}*But*{color} I have already authorized the impalad process user
in Ranger, so that user actually has read and write permissions on the file.
{code:java}
[hdfs@hybrid02 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_7007
Default principal: impala/hybrid02@SENSORSDATA
[hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
-rw------- 1 hdfs hdfs 270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
[hdfs@hybrid02 ~]$ hdfs dfs -get /tmp/user_tag_p19_data_3_test
[hdfs@hybrid02 ~]$ ll -f user_tag_p19_data_3_test
user_tag_p19_data_3_test
{code}
The exception occurs because the permission check on files in
`org.apache.impala.analysis.LoadDataStmt#analyzePaths` is mainly based on
`org.apache.hadoop.fs.FileStatus#permission`, which does not reflect
permissions granted through Ranger.
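The mismatch can be sketched with a small stand-alone model. This is only an
illustration under stated assumptions, not Impala or Hadoop code: the class
`ReadCheckSketch`, both methods, the `policyReaders` set (standing in for a
Ranger allow policy), and the assumption that the `impala` user belongs only to
the `impala` group are all hypothetical. The owner, group, and mode are taken
from the `hdfs dfs -ls` output above.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical model for illustration only; this is not Impala or Hadoop code.
final class ReadCheckSketch {

    // Decides from owner, group and mode bits alone, the way a check based
    // purely on the FileStatus permission field would.
    static boolean modeBitsAllowRead(String user, Set<String> userGroups,
                                     String owner, String group, String mode) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        if (user.equals(owner)) {
            return perms.contains(PosixFilePermission.OWNER_READ);
        }
        if (userGroups.contains(group)) {
            return perms.contains(PosixFilePermission.GROUP_READ);
        }
        return perms.contains(PosixFilePermission.OTHERS_READ);
    }

    // Simplified effective check: an external policy grant (standing in for a
    // Ranger allow policy) is honored before falling back to the mode bits.
    static boolean effectiveAllowRead(String user, Set<String> userGroups,
                                      String owner, String group, String mode,
                                      Set<String> policyReaders) {
        return policyReaders.contains(user)
            || modeBitsAllowRead(user, userGroups, owner, group, mode);
    }

    public static void main(String[] args) {
        Set<String> groups = Set.of("impala");        // assumed group membership
        Set<String> policyReaders = Set.of("impala"); // assumed Ranger grant
        // Mode bits alone deny the read: impala is neither owner nor in group hdfs.
        System.out.println(modeBitsAllowRead(
            "impala", groups, "hdfs", "hdfs", "rw-------"));
        // With the policy grant taken into account, the read is allowed.
        System.out.println(effectiveAllowRead(
            "impala", groups, "hdfs", "hdfs", "rw-------", policyReaders));
    }
}
```

On a real cluster the effective decision is made by the NameNode. Hadoop exposes
it to clients through `FileSystem#access(Path, FsAction)`, which throws
`AccessControlException` when access is denied. Whether `analyzePaths` should
use that call is a design question this report leaves open.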
was:
Currently, when I run LOAD DATA from an HDFS path on a cluster configured with
Ranger, the following exception occurs:
{code:java}
// sql
LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl
PARTITION(status='origin');
// impalad exception
org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
    at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
    at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
    at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
    at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
    at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
    at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
    at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
    at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
{code}
According to `org.apache.hadoop.fs.FileStatus#permission`, the impalad process
user is not the owner of the HDFS file and does not have permission to read it.
However, HDFS is configured with Ranger, and after I authorized the impalad
process user in Ranger, that user had actual read and write permissions.
That's why this exception occurs.
> Unable to LOAD DATA from HDFS configured with ranger
> ----------------------------------------------------
>
> Key: IMPALA-11194
> URL: https://issues.apache.org/jira/browse/IMPALA-11194
> Project: IMPALA
> Issue Type: Improvement
> Reporter: LiPenglin
> Priority: Major
>
> Currently, when I run LOAD DATA from an HDFS path on a cluster configured
> with Ranger, the following exception occurs:
> {code:java}
> // sql
> LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl
> PARTITION(status='origin');
> // impalad exception
> org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
>     at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
>     at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
>     at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
>     at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
>     at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
>     at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
>     at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
>     at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
> {code}
> According to `org.apache.hadoop.fs.FileStatus#permission`, the impalad
> process user is not the owner of the HDFS file and does not have permission
> to read it.
>
> {code:java}
> [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
> -rw------- 1 hdfs hdfs 270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
> {code}
>
> {color:#FF0000}*But*{color} I have already authorized the impalad process
> user in Ranger, so that user actually has read and write permissions on the
> file.
>
> {code:java}
> [hdfs@hybrid02 ~]$ klist
> Ticket cache: FILE:/tmp/krb5cc_7007
> Default principal: impala/hybrid02@SENSORSDATA
> [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
> -rw------- 1 hdfs hdfs 270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
> [hdfs@hybrid02 ~]$ hdfs dfs -get /tmp/user_tag_p19_data_3_test
> [hdfs@hybrid02 ~]$ ll -f user_tag_p19_data_3_test
> user_tag_p19_data_3_test
> {code}
> The exception occurs because the permission check on files in
> `org.apache.impala.analysis.LoadDataStmt#analyzePaths` is mainly based on
> `org.apache.hadoop.fs.FileStatus#permission`, which does not reflect
> permissions granted through Ranger.