[I] Not able to create Pyiceberg table with Partition Spec using pyarrow schema [iceberg-python]

via GitHub Sun, 16 Mar 2025 23:33:18 -0700


heman026 opened a new issue, #1797:
URL: https://github.com/apache/iceberg-python/issues/1797


   ### Question
   
   Hi,
   I am reading an existing Iceberg table using a batch reader. The reader exposes a PyArrow schema. When I create a table from this PyArrow schema, the fields have field_id = -1, which causes a problem when the table has a partition spec. Because field_id = -1, a different column is assigned as the partition column instead of the intended one.
   
   ```
   batches = catalog.load_table(source_table).scan().to_arrow_batch_reader()
   
   partition_spec = PartitionSpec(PartitionField(
       source_id=-1,
       field_id=1000,
       name="event_date",
       transform=DayTransform(),
   ))
   
   catalog.create_table_if_not_exists(
       iceberg_table,
       batches.get_schema(),
       partition_spec=partition_spec,
       properties={
           'downcast-ns-timestamp-to-us-on-write': True,
           PYARROW_USE_LARGE_TYPES_ON_READ: True,
       },
   )
   ```
   
   Can you let me know how to create a partition spec when I only have a PyArrow schema? Am I missing something?
   
   Let me know if you need more info.
   



