afiodorov opened a new pull request, #1743:
URL: https://github.com/apache/iceberg-python/pull/1743
I want to be able to:
a) add files that are partitioned by a path naming convention,
e.g. s3://bucket/table/year=2025/month=12
b) add files even if they have extra columns, without having to migrate the
table
This comes from a common pattern: existing Hive tables that need to be
migrated to Iceberg.
I propose achieving this as follows:
`
import re
from pyiceberg.typedefs import Record

pattern = re.compile(r"([^/]+)=([^/]+)")  # Hive-style key=value path segments

def deduct_partition(path: str) -> Record:
    return Record(**dict(pattern.findall(path)))

table.add_files(['s3://bucket/table/year=2025/month=12/file.parquet'],
                check_schema=False, partition_deductor=deduct_partition)
`
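For illustration, here is a minimal standalone sketch of what the pattern above extracts from a path. It returns a plain dict rather than a Record so it runs without pyiceberg; the helper name `partition_values` is hypothetical and not part of this PR. The matched values come out as strings, so the integer cast at the end is only an assumption about what a table with integer partition fields would need.

```python
import re

# Same pattern as in the snippet above: one key=value pair per path segment.
PATTERN = re.compile(r"([^/]+)=([^/]+)")


def partition_values(path: str) -> dict[str, str]:
    """Return the Hive-style key=value pairs found in a file path."""
    return dict(PATTERN.findall(path))


values = partition_values("s3://bucket/table/year=2025/month=12/file.parquet")
print(values)  # {'year': '2025', 'month': '12'}

# Assumption: both partition fields are integers in the target table.
typed = {key: int(value) for key, value in values.items()}
print(typed)  # {'year': 2025, 'month': 12}
```

Note that any path segment containing "=" matches, including a file name such as part=0.parquet, so a stricter pattern may be needed for some layouts.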