Omri-Ben-Yair opened a new issue, #13354:
URL: https://github.com/apache/iceberg/issues/13354
### Apache Iceberg version
1.8.1
### Query engine
Spark
### Please describe the bug 🐞
I'm trying to access a table using a role from AWS account A, while the table
itself lives in a remote AWS account B.
### I'm using PySpark 3.5.5 with the following jars:
- **aws-java-sdk-bundle-1.12.262.jar**
- **iceberg-aws-bundle-1.8.1.jar**
- **iceberg-spark-runtime-3.5_2.12-1.8.1.jar**
### Using the following code to build the spark Session:
```
# Assumed imports: Session() below is taken to be boto3's Session.
from boto3 import Session
from pyspark.sql import SparkSession

# warehouse, catalog_name, catalog_account_id and role_arn_to_assume are
# defined elsewhere in the job.
# The chain is wrapped in parentheses so every .config() call is applied.
spark = (
    SparkSession.builder
    .appName('BronzeToSilverETL')
    .config('spark.sql.warehouse.dir', warehouse)
    .config('spark.sql.extensions',
            'org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions')
    .config(f'spark.sql.catalog.{catalog_name}',
            'org.apache.iceberg.spark.SparkCatalog')
    .config(f'spark.sql.catalog.{catalog_name}.warehouse', warehouse)
    .config(f'spark.sql.catalog.{catalog_name}.catalog-impl',
            'org.apache.iceberg.aws.glue.GlueCatalog')
    .config(f'spark.sql.catalog.{catalog_name}.io-impl',
            'org.apache.iceberg.aws.s3.S3FileIO')
    .config(f'spark.sql.catalog.{catalog_name}.catalog-id',
            catalog_account_id)
    .config('spark.hadoop.fs.s3a.endpoint', 's3.amazonaws.com')
    .config(f'spark.sql.catalog.{catalog_name}.client.factory',
            'org.apache.iceberg.aws.lakeformation.LakeFormationAwsClientFactory')
    .config(f"spark.sql.catalog.{catalog_name}.glue.lakeformation-enabled", "true")
    .config(f"spark.sql.catalog.{catalog_name}.client.assume-role.arn",
            role_arn_to_assume)
    .config(f"spark.sql.catalog.{catalog_name}.client.assume-role.region",
            Session().region_name)
    .config(f"spark.sql.catalog.{catalog_name}.client.assume-role.tags.LakeFormationAuthorizedCaller",
            "omri")
    .config(f"spark.sql.catalog.{catalog_name}.glue.id",
            catalog_account_id)
    .config(f"spark.sql.catalog.{catalog_name}.glue.account-id",
            catalog_account_id)
    .getOrCreate()
)
```
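Since the assume-role and tag settings only matter if they actually reach the catalog, one way to double-check is to list the options Spark holds for this catalog. This is a minimal sketch, not part of my original job: `catalog_options` is a hypothetical helper name (not a Spark or Iceberg API), and the sample pairs and the catalog name `glue_cat` are placeholders.

```python
def catalog_options(conf_pairs, catalog_name):
    """Return the options set for one Spark catalog, with the prefix stripped.

    conf_pairs is any iterable of (key, value) tuples. The helper name is
    hypothetical; it is not part of Spark or Iceberg.
    """
    prefix = f"spark.sql.catalog.{catalog_name}."
    return {
        key[len(prefix):]: value
        for key, value in conf_pairs
        if key.startswith(prefix)
    }


# Stand-in pairs for illustration only; the values are placeholders.
sample_pairs = [
    ("spark.sql.catalog.glue_cat.glue.lakeformation-enabled", "true"),
    ("spark.sql.catalog.glue_cat.client.assume-role.region", "us-east-1"),
    ("spark.hadoop.fs.s3a.endpoint", "s3.amazonaws.com"),
]
options = catalog_options(sample_pairs, "glue_cat")
for key in sorted(options):
    print(key, "=", options[key])
```

Against a live session, the pairs would come from `spark.sparkContext.getConf().getAll()`, which returns the configured `(key, value)` tuples.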
For example, running `spark.catalog.tableExists(tableName=target_table)` fails.
### This is the error I'm getting:
```
py4j.protocol.Py4JJavaError: An error occurred while calling o55.tableExists.
: software.amazon.awssdk.services.lakeformation.model.AccessDeniedException: Access is not allowed. (Service: LakeFormation, Status Code: 400, Request ID: dd025144-c3cb-4dcc-b6b7-b488c3130fcf)
    at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:124)
    at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:81)
    at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:59)
    at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:40)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38)
....
```
### Diagnostic:
- I've verified the Lake Formation/IAM permissions. I'm able to query the
table with the same role via Athena, so at this point I'm sure the issue is in
the Spark client.
- I followed the AWS guide for third-party clients:
https://docs.aws.amazon.com/lake-formation/latest/dg/permitting-third-party-call.html
- This is the CloudTrail event for the LakeFormation GetDataAccess call that
fails:
```
"errorCode": "AccessDenied",
"errorMessage": "An unknown error occurred",
"requestParameters": {
    "tableArn": "arn:aws:glue:us-east-1:053348864140:table/bronze_dps1_testdb_s_1_omribenmaster_dpltest/bronze_to_silver_table",
    "permissions": []
},
"responseElements": null,
"additionalEventData": {
    "requesterService": "UNKNOWN",
    "LakeFormationAuthorizedSessionTag": "LakeFormationAuthorizedCaller:my_tag",
    "LakeFormationTrustedCallerInvocation": "true"
},
```
- For reference, this is the CloudTrail event for a LakeFormation
GetDataAccess call that succeeds:
```
{
    "requestParameters": {
        "tableArn": "arn:aws:glue:REGION:ACCOUNT_ID:table/DATABASE/TABLE_NAME",
        "permissions": [
            "SELECT"
        ],
        "auditContext": {
            "additionalAuditContext": {
                "queryId": "QUERY_ID"
            }
        },
        "cellLevelSecurityEnforced": true,
        "expectedTableId": "EXPECTED_TABLE_ID"
    },
    "responseElements": null,
    "additionalEventData": {
        "requesterService": "ATHENA",
        "LakeFormationTrustedCallerInvocation": "true",
        "lakeFormationPrincipal": "arn:aws:iam::ACCOUNT_ID:role/ROLE_PATH/ROLE_NAME",
        "lakeFormationRoleSessionName": "SESSION_NAME"
    }
}
```
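To make the differences between the two events easier to see, the comparison can be scripted. This is a minimal sketch that only covers the `additionalEventData` fields shown in the two events; `diff_fields` is a hypothetical helper name, and the dictionaries are copied by hand from the CloudTrail excerpts rather than fetched from AWS.

```python
def diff_fields(failing, succeeding):
    """Report keys present in only one dict, and keys whose values differ."""
    only_failing = sorted(set(failing) - set(succeeding))
    only_succeeding = sorted(set(succeeding) - set(failing))
    changed = sorted(
        key for key in set(failing) & set(succeeding)
        if failing[key] != succeeding[key]
    )
    return only_failing, only_succeeding, changed


# additionalEventData from the failing (Spark) event.
failing_event_data = {
    "requesterService": "UNKNOWN",
    "LakeFormationAuthorizedSessionTag": "LakeFormationAuthorizedCaller:my_tag",
    "LakeFormationTrustedCallerInvocation": "true",
}
# additionalEventData from the succeeding (Athena) event.
succeeding_event_data = {
    "requesterService": "ATHENA",
    "LakeFormationTrustedCallerInvocation": "true",
    "lakeFormationPrincipal": "arn:aws:iam::ACCOUNT_ID:role/ROLE_PATH/ROLE_NAME",
    "lakeFormationRoleSessionName": "SESSION_NAME",
}

only_failing, only_succeeding, changed = diff_fields(
    failing_event_data, succeeding_event_data
)
print("only in failing event:", only_failing)
print("only in succeeding event:", only_succeeding)
print("different values:", changed)
```

The same helper can be pointed at the two `requestParameters` objects, where the failing call carries an empty `permissions` list and none of the `auditContext`, `cellLevelSecurityEnforced`, or `expectedTableId` fields.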
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [x] I cannot contribute a fix for this bug at this time