lk-1984 opened a new issue, #12565:
URL: https://github.com/apache/iceberg/issues/12565
### Apache Iceberg version
1.8.1 (latest release)
### Query engine
None
### Please describe the bug 🐞
My test setup uses MinIO for object storage; that is where the Iceberg tables live.
My connector config:
```
curl -X POST -H "Content-Type: application/json" --data '{
  "name": "iceberg-sink-connector",
  "config": {
    "connector.class": "org.apache.iceberg.connect.IcebergSinkConnector",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "tasks.max": "1",
    "topics": "user",
    "iceberg.tables": "default.user",
    "iceberg.catalog.type": "hive",
    "iceberg.catalog.uri": "thrift://hms:9083",
    "iceberg.catalog.io-impl": "org.apache.iceberg.hadoop.HadoopFileIO",
    "iceberg.catalog.warehouse": "s3a://datalakehouse",
    "iceberg.catalog.client.region": "us-east-1",
    "iceberg.catalog.s3a.access-key-id": "admin",
    "iceberg.catalog.s3a.secret-access-key": "adminadmin",
    "iceberg.catalog.s3a.endpoint": "http://minio:9100",
    "iceberg.catalog.s3a.path.style.access": "true",
    "iceberg.kafka.config.storage.replication.factor": "3",
    "iceberg.kafka.offset.storage.replication.factor": "3",
    "iceberg.kafka.status.storage.replication.factor": "3",
    "iceberg.kafka.cleanup.policy": "compact"
  }
}' http://localhost:18083/connectors | jq
```
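Since the catalog is configured with `HadoopFileIO`, I suspect the S3A settings need to reach the Hadoop configuration as `fs.s3a.*` keys rather than as `iceberg.catalog.s3a.*` properties. If the sink passes `iceberg.hadoop.`-prefixed properties through to the Hadoop configuration (my assumption; the key names below are my guess at the mapping, not a confirmed fix), the equivalent fragment would look like:

```
"iceberg.hadoop.fs.s3a.endpoint": "http://minio:9100",
"iceberg.hadoop.fs.s3a.path.style.access": "true",
"iceberg.hadoop.fs.s3a.access.key": "admin",
"iceberg.hadoop.fs.s3a.secret.key": "adminadmin",
```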
The relevant part of my Docker Compose file:
```
kafka-connect:
  image: cp-kafka-connect-base:7.9.0
  hostname: kafka-connect
  networks:
    data-lakehouse:
  depends_on:
    - broker1
    - broker2
    - broker3
    - schema-registry
  ports:
    - 18083:8083
  environment:
    CONNECT_BOOTSTRAP_SERVERS: "broker-1:29092,broker-2:29093,broker-3:29094"
    CONNECT_REST_PORT: 18083
    CONNECT_GROUP_ID: kafka-connect
    CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs
    CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets
    CONNECT_STATUS_STORAGE_TOPIC: _connect-status
    CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
    CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
    CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
    CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
    CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
    AWS_ACCESS_KEY_ID: "admin"
    AWS_SECRET_ACCESS_KEY: "adminadmin"
    AWS_ENDPOINT_URL: "http://minio:9100"
  volumes:
    - ./jars/iceberg:/data/connect-jars
```
When records arrive, the sink task dies with the following error:
```
2025-03-18 14:28:38 [2025-03-18 12:28:38,724] ERROR
[iceberg-sink-connector|task-0] WorkerSinkTask{id=iceberg-sink-connector-0}
Task threw an uncaught and unrecoverable exception. Task is being killed and
will not recover until manually restarted
(org.apache.kafka.connect.runtime.WorkerTask:234)
2025-03-18 14:28:38 org.apache.kafka.connect.errors.ConnectException:
Exiting WorkerSinkTask due to unrecoverable exception.
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:636)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:345)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:247)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:216)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:226)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:281)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:238)
2025-03-18 14:28:38 at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
2025-03-18 14:28:38 at java.base/java.lang.Thread.run(Unknown Source)
2025-03-18 14:28:38 Caused by:
org.apache.iceberg.exceptions.RuntimeIOException: Failed to open input stream
for file:
s3a://datalakehouse/user/metadata/00000-2e8cd372-fced-4814-abcb-71b57841bc2f.metadata.json
2025-03-18 14:28:38 at
org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:187)
2025-03-18 14:28:38 at
org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:281)
2025-03-18 14:28:38 at
org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:275)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:179)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:198)
2025-03-18 14:28:38 at
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
2025-03-18 14:28:38 at
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
2025-03-18 14:28:38 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
2025-03-18 14:28:38 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:198)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:175)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:170)
2025-03-18 14:28:38 at
org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:167)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:87)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:70)
2025-03-18 14:28:38 at
org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:49)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.IcebergWriterFactory.createWriter(IcebergWriterFactory.java:59)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.lambda$writerForTable$3(SinkWriter.java:139)
2025-03-18 14:28:38 at
java.base/java.util.HashMap.computeIfAbsent(Unknown Source)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.writerForTable(SinkWriter.java:138)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.lambda$routeRecordStatically$1(SinkWriter.java:98)
2025-03-18 14:28:38 at
java.base/java.util.Arrays$ArrayList.forEach(Unknown Source)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.routeRecordStatically(SinkWriter.java:96)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:85)
2025-03-18 14:28:38 at java.base/java.util.ArrayList.forEach(Unknown
Source)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:68)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.channel.Worker.save(Worker.java:124)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.channel.CommitterImpl.save(CommitterImpl.java:88)
2025-03-18 14:28:38 at
org.apache.iceberg.connect.IcebergSinkTask.put(IcebergSinkTask.java:87)
2025-03-18 14:28:38 at
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:606)
2025-03-18 14:28:38 ... 11 more
2025-03-18 14:28:38 Caused by: java.net.UnknownHostException: getFileStatus
on
s3a://datalakehouse/user/metadata/00000-2e8cd372-fced-4814-abcb-71b57841bc2f.metadata.json:
software.amazon.awssdk.core.exception.SdkClientException: Received an
UnknownHostException when attempting to interact with a service. See cause for
the exact endpoint that is failing to resolve. If this is happening on an
endpoint that previously worked, there may be a network connectivity issue or
your DNS cache could be storing endpoints for too long.:
software.amazon.awssdk.core.exception.SdkClientException: Received an
UnknownHostException when attempting to interact with a service. See cause for
the exact endpoint that is failing to resolve. If this is happening on an
endpoint that previously worked, there may be a network connectivity issue or
your DNS cache could be storing endpoints for too long.: datalakehouse.minio
2025-03-18 14:28:38 at
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
2025-03-18 14:28:38 at
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown
Source)
2025-03-18 14:28:38 at
java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
Source)
2025-03-18 14:28:38 at
java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.impl.ErrorTranslation.wrapWithInnerIOE(ErrorTranslation.java:182)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.impl.ErrorTranslation.maybeExtractIOException(ErrorTranslation.java:152)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:156)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4104)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:4007)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.extractOrFetchSimpleFileStatus(S3AFileSystem.java:5649)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$executeOpen$4(S3AFileSystem.java:1861)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1859)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1834)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.FileSystem.open(FileSystem.java:997)
2025-03-18 14:28:38 at
org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183)
2025-03-18 14:28:38 ... 40 more
2025-03-18 14:28:38 Caused by:
software.amazon.awssdk.core.exception.SdkClientException: Received an
UnknownHostException when attempting to interact with a service. See cause for
the exact endpoint that is failing to resolve. If this is happening on an
endpoint that previously worked, there may be a network connectivity issue or
your DNS cache could be storing endpoints for too long.
2025-03-18 14:28:38 at
software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
2025-03-18 14:28:38 at
software.amazon.awssdk.awscore.interceptor.HelpfulUnknownHostExceptionInterceptor.modifyException(HelpfulUnknownHostExceptionInterceptor.java:59)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.modifyException(ExecutionInterceptorChain.java:181)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.runModifyException(ExceptionReportingUtils.java:54)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.reportFailureToInterceptors(ExceptionReportingUtils.java:38)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:39)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
2025-03-18 14:28:38 at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
2025-03-18 14:28:38 at
software.amazon.awssdk.services.s3.DefaultS3Client.headObject(DefaultS3Client.java:7029)
2025-03-18 14:28:38 at
software.amazon.awssdk.services.s3.DelegatingS3Client.lambda$headObject$56(DelegatingS3Client.java:5678)
2025-03-18 14:28:38 at
software.amazon.awssdk.services.s3.internal.crossregion.S3CrossRegionSyncClient.invokeOperation(S3CrossRegionSyncClient.java:67)
2025-03-18 14:28:38 at
software.amazon.awssdk.services.s3.DelegatingS3Client.headObject(DelegatingS3Client.java:5678)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$10(S3AFileSystem.java:3049)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3036)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3016)
2025-03-18 14:28:38 at
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4079)
2025-03-18 14:28:38 ... 50 more
2025-03-18 14:28:38 Caused by:
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute
HTTP request: datalakehouse.minio
2025-03-18 14:28:38 at
software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:226)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:65)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:36)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
2025-03-18 14:28:38 ... 69 more
2025-03-18 14:28:38 Suppressed:
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1
failure: Unable to execute HTTP request: datalakehouse.minio
2025-03-18 14:28:38 Suppressed:
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2
failure: Unable to execute HTTP request: datalakehouse.minio
2025-03-18 14:28:38 Suppressed:
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 3
failure: Unable to execute HTTP request: datalakehouse.minio
2025-03-18 14:28:38 Suppressed:
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 4
failure: Unable to execute HTTP request: datalakehouse.minio
2025-03-18 14:28:38 Suppressed:
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 5
failure: Unable to execute HTTP request: datalakehouse.minio
2025-03-18 14:28:38 Caused by: java.net.UnknownHostException:
datalakehouse.minio
2025-03-18 14:28:38 at
java.base/java.net.InetAddress$CachedAddresses.get(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.net.InetAddress.getAllByName0(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.net.InetAddress.getAllByName(Unknown Source)
2025-03-18 14:28:38 at
java.base/java.net.InetAddress.getAllByName(Unknown Source)
2025-03-18 14:28:38 at
org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
2025-03-18 14:28:38 at
org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
2025-03-18 14:28:38 at
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.internal.conn.ClientConnectionManagerFactory$DelegatingHttpClientConnectionManager.connect(ClientConnectionManagerFactory.java:86)
2025-03-18 14:28:38 at
org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
2025-03-18 14:28:38 at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
2025-03-18 14:28:38 at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
2025-03-18 14:28:38 at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
2025-03-18 14:28:38 at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
2025-03-18 14:28:38 at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.internal.impl.ApacheSdkHttpClient.execute(ApacheSdkHttpClient.java:72)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.ApacheHttpClient.execute(ApacheHttpClient.java:254)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.ApacheHttpClient.access$500(ApacheHttpClient.java:104)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:231)
2025-03-18 14:28:38 at
software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:228)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:102)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.executeHttpRequest(MakeHttpRequestStage.java:79)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:57)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:40)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:74)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:43)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:79)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:41)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.executeRequest(RetryableStage2.java:93)
2025-03-18 14:28:38 at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:56)
2025-03-18 14:28:38 ... 81 more
```
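The `datalakehouse.minio` hostname in the trace tells me the S3 client is building a virtual-host-style URL, i.e. my path-style setting never reached the S3A filesystem. A toy sketch of the two addressing modes (illustration only, not the SDK's actual code):

```python
from urllib.parse import urlparse

def s3_request_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build the request URL the way an S3 client would for each addressing mode."""
    parsed = urlparse(endpoint)
    if path_style:
        # path-style keeps the endpoint host and puts the bucket in the path
        return f"{parsed.scheme}://{parsed.netloc}/{bucket}/{key}"
    # virtual-host style prepends the bucket to the hostname,
    # which must then resolve in DNS (it doesn't in my Docker network)
    return f"{parsed.scheme}://{bucket}.{parsed.netloc}/{key}"

print(s3_request_url("http://minio:9100", "datalakehouse", "user/metadata/demo.json", path_style=False))
# -> http://datalakehouse.minio:9100/user/metadata/demo.json
```

The host `datalakehouse.minio` in that URL is exactly the name the `UnknownHostException` complains about.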
I had to add the `AWS_*` environment variables to the container; without them I get the following instead:
```
2025-03-18 14:33:06 [2025-03-18 12:33:06,076] ERROR
[iceberg-sink-connector|task-0] WorkerSinkTask{id=iceberg-sink-connector-0}
Task threw an uncaught and unrecoverable exception. Task is being killed and
will not recover until manually restarted. Error: Failed to open input stream
for file:
s3a://datalakehouse/user/metadata/00000-2e8cd372-fced-4814-abcb-71b57841bc2f.metadata.json
(org.apache.kafka.connect.runtime.WorkerSinkTask:634)
2025-03-18 14:33:06 org.apache.iceberg.exceptions.RuntimeIOException: Failed
to open input stream for file:
s3a://datalakehouse/user/metadata/00000-2e8cd372-fced-4814-abcb-71b57841bc2f.metadata.json
2025-03-18 14:33:06 at
org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:187)
2025-03-18 14:33:06 at
org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:281)
2025-03-18 14:33:06 at
org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:275)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:179)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:198)
2025-03-18 14:33:06 at
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
2025-03-18 14:33:06 at
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
2025-03-18 14:33:06 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
2025-03-18 14:33:06 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:198)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:175)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:170)
2025-03-18 14:33:06 at
org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:167)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:87)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:70)
2025-03-18 14:33:06 at
org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:49)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.IcebergWriterFactory.createWriter(IcebergWriterFactory.java:59)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.lambda$writerForTable$3(SinkWriter.java:139)
2025-03-18 14:33:06 at
java.base/java.util.HashMap.computeIfAbsent(Unknown Source)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.writerForTable(SinkWriter.java:138)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.lambda$routeRecordStatically$1(SinkWriter.java:98)
2025-03-18 14:33:06 at
java.base/java.util.Arrays$ArrayList.forEach(Unknown Source)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.routeRecordStatically(SinkWriter.java:96)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:85)
2025-03-18 14:33:06 at java.base/java.util.ArrayList.forEach(Unknown
Source)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:68)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.channel.Worker.save(Worker.java:124)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.channel.CommitterImpl.save(CommitterImpl.java:88)
2025-03-18 14:33:06 at
org.apache.iceberg.connect.IcebergSinkTask.put(IcebergSinkTask.java:87)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:606)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:345)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:247)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:216)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:226)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:281)
2025-03-18 14:33:06 at
org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:238)
2025-03-18 14:33:06 at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
2025-03-18 14:33:06 at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
2025-03-18 14:33:06 at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
2025-03-18 14:33:06 at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
2025-03-18 14:33:06 at java.base/java.lang.Thread.run(Unknown Source)
2025-03-18 14:33:06 Caused by: java.nio.file.AccessDeniedException:
s3a://datalakehouse/user/metadata/00000-2e8cd372-fced-4814-abcb-71b57841bc2f.metadata.json:
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider :
software.amazon.awssdk.core.exception.SdkClientException: Unable to load
credentials from system settings. Access key must be specified either via
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.maybeTranslateCredentialException(AWSCredentialProviderList.java:351)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:203)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:156)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4104)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:4007)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.extractOrFetchSimpleFileStatus(S3AFileSystem.java:5649)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$executeOpen$4(S3AFileSystem.java:1861)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1859)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1834)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.FileSystem.open(FileSystem.java:997)
2025-03-18 14:33:06 at
org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183)
2025-03-18 14:33:06 ... 40 more
2025-03-18 14:33:06 Caused by:
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider :
software.amazon.awssdk.core.exception.SdkClientException: Unable to load
credentials from system settings. Access key must be specified either via
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.resolveCredentials(AWSCredentialProviderList.java:214)
2025-03-18 14:33:06 at
software.amazon.awssdk.auth.credentials.AwsCredentialsProvider.resolveIdentity(AwsCredentialsProvider.java:54)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.lambda$trySelectAuthScheme$4(S3AuthSchemeInterceptor.java:163)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.internal.util.MetricUtils.reportDuration(MetricUtils.java:80)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.trySelectAuthScheme(S3AuthSchemeInterceptor.java:163)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.selectAuthScheme(S3AuthSchemeInterceptor.java:84)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.auth.scheme.internal.S3AuthSchemeInterceptor.beforeExecution(S3AuthSchemeInterceptor.java:64)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.lambda$beforeExecution$1(ExecutionInterceptorChain.java:59)
2025-03-18 14:33:06 at java.base/java.util.ArrayList.forEach(Unknown
Source)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.beforeExecution(ExecutionInterceptorChain.java:59)
2025-03-18 14:33:06 at
software.amazon.awssdk.awscore.internal.AwsExecutionContextBuilder.runInitialInterceptors(AwsExecutionContextBuilder.java:248)
2025-03-18 14:33:06 at
software.amazon.awssdk.awscore.internal.AwsExecutionContextBuilder.invokeInterceptorsAndCreateExecutionContext(AwsExecutionContextBuilder.java:138)
2025-03-18 14:33:06 at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.invokeInterceptorsAndCreateExecutionContext(AwsSyncClientHandler.java:67)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:76)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
2025-03-18 14:33:06 at
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
2025-03-18 14:33:06 at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.DefaultS3Client.headObject(DefaultS3Client.java:7029)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.DelegatingS3Client.lambda$headObject$56(DelegatingS3Client.java:5678)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.internal.crossregion.S3CrossRegionSyncClient.invokeOperation(S3CrossRegionSyncClient.java:67)
2025-03-18 14:33:06 at
software.amazon.awssdk.services.s3.DelegatingS3Client.headObject(DelegatingS3Client.java:5678)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$10(S3AFileSystem.java:3049)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3036)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3016)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4079)
2025-03-18 14:33:06 ... 50 more
2025-03-18 14:33:06 Caused by:
software.amazon.awssdk.core.exception.SdkClientException: Unable to load
credentials from system settings. Access key must be specified either via
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2025-03-18 14:33:06 at
software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
2025-03-18 14:33:06 at
software.amazon.awssdk.auth.credentials.internal.SystemSettingsCredentialsProvider.resolveCredentials(SystemSettingsCredentialsProvider.java:60)
2025-03-18 14:33:06 at
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.resolveCredentials(AWSCredentialProviderList.java:175)
2025-03-18 14:33:06 ... 77 more
```
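Going by the provider names in this second trace, `SimpleAWSCredentialsProvider` reads `fs.s3a.access.key` / `fs.s3a.secret.key` from the Hadoop configuration, which would explain why only the `AWS_*` environment variables (picked up by `EnvironmentVariableCredentialsProvider`) unblocked it: the `iceberg.catalog.s3a.*` keys are apparently never seen by S3A. A core-site.xml-style fragment with the keys I believe S3A expects (my assumption, untested on this setup):

```
<configuration>
  <property><name>fs.s3a.access.key</name><value>admin</value></property>
  <property><name>fs.s3a.secret.key</name><value>adminadmin</value></property>
  <property><name>fs.s3a.endpoint</name><value>http://minio:9100</value></property>
  <property><name>fs.s3a.path.style.access</name><value>true</value></property>
</configuration>
```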
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time