syun64 commented on code in PR #289:
URL: https://github.com/apache/iceberg-python/pull/289#discussion_r1472905954
##########
tests/catalog/test_base.py:
##########
@@ -355,8 +162,6 @@ def test_create_table_pyarrow_schema(catalog:
InMemoryCatalog, pyarrow_schema_si
table = catalog.create_table(
identifier=TEST_TABLE_IDENTIFIER,
schema=pyarrow_schema_simple_without_ids,
- location=TEST_TABLE_LOCATION,
- partition_spec=TEST_TABLE_PARTITION_SPEC,
Review Comment:
Hey @kevinjqliu thank you for flagging this 😄 I think '-1' ID discrepancy is
the symptom of the issue that makes the issue easy to understand, just as we
decided in
https://github.com/apache/iceberg-python/pull/305#discussion_r1469710203
The root cause of the issue I think is that we are introducing a way for
non-ID's schema (PyArrow Schema) to be used as an input into create_table,
while not supporting the same for partition_spec and sort_order (PartitionField
and SortField both require field IDs as inputs).
So I think we should update both
[assign_fresh_partition_spec_ids](https://github.com/apache/iceberg-python/blob/102e043b182a0b7bf4c9f66c5d77b24eedbd3766/pyiceberg/partitioning.py#L200)
and
[assign_fresh_sort_order_ids](https://github.com/apache/iceberg-python/blob/b7cf14e3b195064d8740b8f2ab306d8729e54dc2/pyiceberg/table/sorting.py#L171)
to support field look up by name.
@Fokko - does that sound like a good way to resolve this issue?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]