lozbrown opened a new issue, #1775:
URL: https://github.com/apache/iceberg-python/issues/1775
I'm trying to use pyiceberg inside a pod that has S3 access via an IAM role.
I've configured the PYICEBERG_CATALOG__DEFAULT__S3__ROLE_ARN and AWS_ROLE_ARN
environment variables, but table creation fails with a HeadObject error:
```
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/__init__.py", line 420, in create_table_if_not_exists
    return self.create_table(identifier, schema, location, partition_spec, sort_order, properties)
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/hive.py", line 404, in create_table
    self._write_metadata(staged_table.metadata, staged_table.io, staged_table.metadata_location)
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/__init__.py", line 939, in _write_metadata
    ToOutputFile.table_metadata(metadata, io.new_output(metadata_path))
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/serializers.py", line 130, in table_metadata
    with output_file.create(overwrite=overwrite) as output_stream:
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/io/pyarrow.py", line 338, in create
    if not overwrite and self.exists() is True:
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/io/pyarrow.py", line 282, in exists
    self._file_info()  # raises FileNotFoundError if it does not exist
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/io/pyarrow.py", line 264, in _file_info
    file_info = self._filesystem.get_file_info(self._path)
  File "pyarrow/_fs.pyx", line 590, in pyarrow._fs.FileSystem.get_file_info
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
OSError: When getting information for key 'schemas/meta.db/trino_queries_iceberg/metadata/00000-41568416-bc76-4236-afab-a7bec772eb32.metadata.json' in bucket 'REDACTED-BUCKET': AWS Error ACCESS_DENIED during HeadObject operation: No response body.
```
I believe that's this PyArrow issue:
https://github.com/apache/arrow/issues/38421
To get past it, I resolve the credentials with boto3 and set these keys upstream in my Python script:
```
import os

import boto3

credentials = boto3.session.Session().get_credentials()
os.environ['AWS_ACCESS_KEY_ID'] = credentials.access_key
os.environ['AWS_SECRET_ACCESS_KEY'] = credentials.secret_key
os.environ['AWS_SESSION_TOKEN'] = credentials.token
os.environ['PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID'] = credentials.access_key
os.environ['PYICEBERG_CATALOG__DEFAULT__S3__SECRET_ACCESS_KEY'] = credentials.secret_key
os.environ['PYICEBERG_CATALOG__DEFAULT__S3__SESSION_TOKEN'] = credentials.token
```
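An alternative sketch of the same workaround: rather than exporting the resolved credentials through environment variables, the values could be passed as FileIO properties when loading the catalog. The property names below (`s3.access-key-id`, `s3.secret-access-key`, `s3.session-token`) are pyiceberg's documented S3 keys; the catalog name `default` and the helper function are assumptions for illustration.

```
def s3_props_from_credentials(creds) -> dict:
    """Map resolved AWS credentials (boto3-style attributes) onto
    pyiceberg S3 FileIO property names."""
    return {
        "s3.access-key-id": creds.access_key,
        "s3.secret-access-key": creds.secret_key,
        "s3.session-token": creds.token,
    }


# Usage sketch (requires boto3 and pyiceberg; not run here):
# import boto3
# from pyiceberg.catalog import load_catalog
# creds = boto3.session.Session().get_credentials()
# catalog = load_catalog("default", **s3_props_from_credentials(creds))
```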
That gets me past the first error, but then I hit the Java error below from the Hive Metastore:
```
[trino@testpod /]$ python query_logger_iceberg.py
Traceback (most recent call last):
  File "//query_logger_iceberg.py", line 57, in <module>
    table = catalog.create_table_if_not_exists(identifier=f'{trino_schema_name}.{tablename}', schema=iceberg_table_schema, partition_spec=partition_spec, location=s3_location, sort_order=sort_order)
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/__init__.py", line 420, in create_table_if_not_exists
    return self.create_table(identifier, schema, location, partition_spec, sort_order, properties)
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/hive.py", line 408, in create_table
    self._create_hive_table(open_client, tbl)
  File "/usr/local/lib64/python3.12/site-packages/pyiceberg/catalog/hive.py", line 357, in _create_hive_table
    open_client.create_table(hive_table)
  File "/usr/local/lib64/python3.12/site-packages/hive_metastore/ThriftHiveMetastore.py", line 3431, in create_table
    self.recv_create_table()
  File "/usr/local/lib64/python3.12/site-packages/hive_metastore/ThriftHiveMetastore.py", line 3457, in recv_create_table
    raise result.o3
hive_metastore.ttypes.MetaException: MetaException(message='java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties (respectively).')
```
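For what it's worth, this second error appears to be raised server-side: the MetaException wraps a `java.lang.IllegalArgumentException` thrown inside the Hive Metastore (a Java process) while it handles `create_table`, so credentials set in my Python process would never reach it. If that reading is right, the metastore itself would need the properties the message names, e.g. in its hive-site.xml (property names taken verbatim from the error; placeholder values, and a role-based credential provider on the metastore host would presumably be preferable to static keys):

```
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>REDACTED</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>REDACTED</value>
</property>
```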
Given that I've already supplied the keys, shouldn't they be passed on?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]