netapp-acheng commented on issue #3440:
URL: https://github.com/apache/polaris/issues/3440#issuecomment-3762242894
I used the environment variables below so Polaris could obtain temporary credentials via AssumeRole:

export AWS_ACCESS_KEY_ID=EYGCX5XXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=+Ea3YoWLXXXXXXXXXXXXXXXXXX
export AWS_ROLE_ARN="arn:aws:iam::123456789101112:role/assumerole"
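For context, this is roughly the AssumeRole exchange I expect here, sketched with AWS SDK v2 (the session name is a made-up placeholder, and the StsClient itself authenticates with the static keys above via the default credential chain):

```scala
// Sketch of the AssumeRole call with AWS SDK v2; roleSessionName is a
// made-up placeholder. The StsClient resolves AWS_ACCESS_KEY_ID /
// AWS_SECRET_ACCESS_KEY from the environment.
import java.net.URI
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.sts.StsClient
import software.amazon.awssdk.services.sts.model.AssumeRoleRequest

val sts = StsClient.builder()
  .region(Region.US_EAST_1)
  .endpointOverride(URI.create("https://sgdemo.example.com")) // my stsEndpoint
  .build()

val resp = sts.assumeRole(
  AssumeRoleRequest.builder()
    .roleArn(sys.env("AWS_ROLE_ARN"))
    .roleSessionName("polaris-test") // placeholder
    .build())

// The result carries a temporary access key, secret key, and session token.
val creds = resp.credentials()
println(s"got session token: ${creds.sessionToken() != null}")
```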
I created a catalog configured to use STS:

{
  "type": "INTERNAL",
  "name": "sts1_catalog",
  "properties": {
    "default-base-location": "s3://sts1-polaris"
  },
  "createTimestamp": 1768449279366,
  "lastUpdateTimestamp": 1768449279366,
  "entityVersion": 1,
  "storageConfigInfo": {
    "storageType": "S3",
    "allowedLocations": [
      "s3://sts1-polaris"
    ],
    "roleArn": "arn:aws:iam::123456789101112:role/assumerole",
    "allowedKmsKeys": [],
    "region": "us-east-1",
    "endpoint": "https://sgdemo.example.com",
    "stsEndpoint": "https://sgdemo.example.com",
    "stsUnavailable": false,
    "pathStyleAccess": false
  }
}
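For completeness, the Spark side is wired up roughly like this (a sketch; the URI, credential, and scope values are placeholders, and the X-Iceberg-Access-Delegation header is what asks Polaris to vend the temporary credentials to the client):

```scala
// Sketch of the Spark wiring for this catalog. URI, credential, and scope
// are placeholders; the access-delegation header requests vended
// credentials instead of the local AWS credential chain.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("polaris-sts-test")
  .config("spark.sql.catalog.sts1_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.sts1_catalog.type", "rest")
  .config("spark.sql.catalog.sts1_catalog.uri", "http://localhost:8181/api/catalog") // placeholder
  .config("spark.sql.catalog.sts1_catalog.credential", "<client-id>:<client-secret>") // placeholder
  .config("spark.sql.catalog.sts1_catalog.scope", "PRINCIPAL_ROLE:ALL") // placeholder
  .config("spark.sql.catalog.sts1_catalog.warehouse", "sts1_catalog")
  .config("spark.sql.catalog.sts1_catalog.header.X-Iceberg-Access-Delegation", "vended-credentials")
  .getOrCreate()
```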
I created a namespace ns1 and a table table1 in this catalog:
scala> spark.sql("CREATE NAMESPACE IF NOT EXISTS sts1_catalog.ns1")
res8: org.apache.spark.sql.DataFrame = []
scala> spark.sql("""
| CREATE TABLE IF NOT EXISTS sts1_catalog.ns1.table1 (
| id INT,
| data STRING
| )
| USING iceberg
| TBLPROPERTIES ('format-version'='2')
| """)
res9: org.apache.spark.sql.DataFrame = []
I saw Polaris send an AssumeRole request to obtain a temporary credential, and it used that temporary credential to PUT /ns1/table1/metadata/00000-0c05b20d-9475-41ac-a6e1-c41a1a49817a.metadata.json to s3://sts1-polaris.
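To see what the catalog is (or isn't) vending to the client, one option is to call the REST loadTable endpoint directly and check the response's config section for the storage credential keys from the Iceberg REST spec (s3.access-key-id, s3.secret-access-key, s3.session-token). A sketch with placeholder endpoint and token:

```scala
// Sketch: call the Iceberg REST loadTable endpoint with the
// access-delegation header and inspect the returned "config" section.
// Endpoint and bearer token below are placeholders.
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

val client = HttpClient.newHttpClient()
val req = HttpRequest.newBuilder()
  .uri(URI.create("http://localhost:8181/api/catalog/v1/sts1_catalog/namespaces/ns1/tables/table1"))
  .header("Authorization", "Bearer <token>") // placeholder
  .header("X-Iceberg-Access-Delegation", "vended-credentials")
  .GET()
  .build()

val body = client.send(req, HttpResponse.BodyHandlers.ofString()).body()
// If vending works, the JSON "config" block should contain
// s3.access-key-id / s3.secret-access-key / s3.session-token.
println(body)
```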
When trying to insert data into table1, the write failed with Access Denied:
scala> spark.sql("""
| INSERT INTO sts1_catalog.ns1.table1 VALUES
| (1, 'alpha'),
| (2, 'beta'),
| (3, 'gamma')
| """)
26/01/16 17:42:41 ERROR Utils: Aborting task
java.io.UncheckedIOException: Failed to close current writer
    at org.apache.iceberg.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:128)
    at org.apache.iceberg.io.RollingFileWriter.close(RollingFileWriter.java:156)
    at org.apache.iceberg.io.RollingDataWriter.close(RollingDataWriter.java:32)
    at org.apache.iceberg.spark.source.SparkWrite$UnpartitionedDataWriter.close(SparkWrite.java:778)
    at org.apache.iceberg.spark.source.SparkWrite$UnpartitionedDataWriter.commit(SparkWrite.java:760)
    at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$5(WriteToDataSourceV2Exec.scala:475)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397)
    at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:491)
    at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:430)
    at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:496)
    at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:393)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
    at org.apache.spark.scheduler.Task.run(Task.scala:141)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
    at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
    at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: 1768603361795122, Extended Request ID: 12295001) (SDK Attempt Count: 1)
From the object storage log, I can see that the PUT request above used the credentials from the environment variables (i.e., no STS token), unlike the PUT request Polaris issued when it created the metadata.json (that one used the temporary credential + STS token).
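That is consistent with the writer falling back to the default AWS credential chain on the executors. A quick sketch to confirm what that chain resolves to there (prints only a short key-id prefix, nothing sensitive):

```scala
// Sketch: resolve the default AWS credential chain the way an S3 client
// would, and print a short, non-sensitive prefix of the key id.
import software.amazon.awssdk.auth.credentials.{AwsSessionCredentials, DefaultCredentialsProvider}

val creds = DefaultCredentialsProvider.create().resolveCredentials()
println("access key prefix: " + creds.accessKeyId().take(4))
println("has session token: " + creds.isInstanceOf[AwsSessionCredentials])
```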
I can send you logs from the object storage server side so you can see the details (I don't want to expose any credential details on a publicly accessible web page).