jessiedanwang opened a new issue, #5381:
URL: https://github.com/apache/iceberg/issues/5381
We have two AWS accounts. In account_a, role_a has permission to assume
role_b in account_b. We run EMR in account_a using role_a and are trying to
access the Glue catalog in account_b. The policy attached to role_a is:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:iam::account_b:role/role_b"
      ]
    }
  ]
}
In account_b, role_b has a trust policy that trusts role_a in account_a,
as well as permission to access the Glue catalog in account_b:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account_a:role/role_a"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:*"
      ],
      "Resource": [
        "arn:aws:glue:us-east-2:account_b:catalog",
        "arn:aws:glue:us-east-2:account_b:database/default",
        "arn:aws:glue:us-east-2:account_b:database/iceberg_db",
        "arn:aws:glue:us-east-2:account_b:table/iceberg_db/*"
      ]
    }
  ]
}
The resource policy on the Glue catalog in account_b is:
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::account_b:role/role_b"
  },
  "Action": "glue:*",
  "Resource": [
    "arn:aws:glue:us-east-2:account_b:catalog",
    "arn:aws:glue:us-east-2:account_b:database/default",
    "arn:aws:glue:us-east-2:account_b:database/iceberg_db",
    "arn:aws:glue:us-east-2:account_b:table/iceberg_db/*"
  ]
}
We run spark-shell on an EMR cluster (which uses role_a) in account_a with the
following command:
spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:0.14.0,software.amazon.awssdk:bundle:2.17.207,software.amazon.awssdk:url-connection-client:2.17.207 \
  --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.my_catalog.warehouse=s3://my_bucket/my_prefix \
  --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
  --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
  --conf spark.sql.catalog.my_catalog.client.factory=org.apache.iceberg.aws.AssumeRoleAwsClientFactory \
  --conf spark.sql.catalog.my_catalog.client.assume-role.arn=arn:aws:iam::account_b:role/role_b \
  --conf spark.sql.catalog.my_catalog.client.assume-role.region=us-east-2 \
  --conf spark.hadoop.hive.metastore.glue.catalogid=account_b
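The same invocation can also be assembled programmatically, which keeps the
--packages list as a single comma-separated token with no stray whitespace.
This is only a sketch in Python: the helper name build_command and the
dictionary layout are hypothetical, and the values are the placeholders from
the command above.

```python
import shlex

CATALOG = "my_catalog"

# Maven coordinates passed to --packages; joined with commas, no spaces.
PACKAGES = [
    "org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:0.14.0",
    "software.amazon.awssdk:bundle:2.17.207",
    "software.amazon.awssdk:url-connection-client:2.17.207",
]

# Keys are suffixes of spark.sql.catalog.<catalog>; "" is the catalog class itself.
CATALOG_CONF = {
    "": "org.apache.iceberg.spark.SparkCatalog",
    "warehouse": "s3://my_bucket/my_prefix",
    "catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
    "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    "client.factory": "org.apache.iceberg.aws.AssumeRoleAwsClientFactory",
    "client.assume-role.arn": "arn:aws:iam::account_b:role/role_b",
    "client.assume-role.region": "us-east-2",
}

# Settings that are not scoped to the Iceberg catalog.
EXTRA_CONF = {
    "spark.hadoop.hive.metastore.glue.catalogid": "account_b",
}


def build_command(packages, catalog, catalog_conf, extra_conf):
    """Return the spark-shell argument list (hypothetical helper)."""
    args = ["spark-shell", "--packages", ",".join(packages)]
    for suffix, value in catalog_conf.items():
        name = f"spark.sql.catalog.{catalog}"
        if suffix:
            name = f"{name}.{suffix}"
        args += ["--conf", f"{name}={value}"]
    for name, value in extra_conf.items():
        args += ["--conf", f"{name}={value}"]
    return args


command = build_command(PACKAGES, CATALOG, CATALOG_CONF, EXTRA_CONF)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```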
Then, from spark-shell, we ran 'spark.sql("show databases")' and got the
following error:
ERROR GlueMetastoreClientDelegate:
com.amazonaws.services.glue.model.AccessDeniedException: User:
arn:aws:sts::account_a:assumed-role/role_a/i-xxxxxx is not authorized to
perform: glue:GetUserDefinedFunctions on resource:
arn:aws:glue:us-east-2:account_b:catalog because no resource-based policy
allows the glue:GetUserDefinedFunctions action (Service: AWSGlue; Status Code:
400; Error Code: AccessDeniedException; Request ID:
f41d74ba-18c9-47d3-a751-4b38f60643a1; Proxy: null)
The error above seems to indicate that role_b is not being assumed by role_a
at all. I would appreciate any advice on why role_b is not being assumed in
this case. Thanks
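To make that reading of the error concrete, the principal that made the
denied call can be pulled out of the message. This is a minimal Python sketch:
the helper name caller_from_error is hypothetical, and the message is
abbreviated from the log above.

```python
import re

# STS assumed-role principals have the form:
# arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
ASSUMED_ROLE_ARN = re.compile(
    r"arn:aws:sts::(?P<account>[^:]+):assumed-role/(?P<role>[^/]+)/(?P<session>\S+)"
)


def caller_from_error(message):
    """Return (account, role, session) of the denied principal, or None."""
    match = ASSUMED_ROLE_ARN.search(message)
    if match is None:
        return None
    return match.group("account"), match.group("role"), match.group("session")


# Abbreviated copy of the AccessDeniedException text from the log.
error = (
    "User: arn:aws:sts::account_a:assumed-role/role_a/i-xxxxxx is not authorized to "
    "perform: glue:GetUserDefinedFunctions on resource: "
    "arn:aws:glue:us-east-2:account_b:catalog"
)

account, role, session = caller_from_error(error)
print(account, role, session)  # prints: account_a role_a i-xxxxxx
```

The caller is role_a in account_a with an instance-style session name, which
is consistent with the request having been made with the EMR instance
credentials rather than with a role_b session.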