Re: [PR] feat: Add existing parquet files [iceberg-rust]

via GitHub Mon, 10 Feb 2025 22:30:01 -0800


ZENOTME commented on PR #960:
URL: https://github.com/apache/iceberg-rust/pull/960#issuecomment-2649905192


   > @Fokko I was looking to do the name mapping and more metrics in another pr. Would you rather I include it in this one?
   
   ```
   def pyarrow_to_schema(
       schema: pa.Schema, name_mapping: Optional[NameMapping] = None, downcast_ns_timestamp_to_us: bool = False
   ) -> Schema:
       has_ids = visit_pyarrow(schema, _HasIds())
       if has_ids:
           return visit_pyarrow(schema, _ConvertToIceberg(downcast_ns_timestamp_to_us=downcast_ns_timestamp_to_us))
       elif name_mapping is not None:
           schema_without_ids = _pyarrow_to_schema_without_ids(schema, downcast_ns_timestamp_to_us=downcast_ns_timestamp_to_us)
           return apply_name_mapping(schema_without_ids, name_mapping)
       else:
           raise ValueError(
               "Parquet file does not have field-ids and the Iceberg table does not have 'schema.name-mapping.default' defined"
           )
   ```
   
   Applying the name mapping relies on `SchemaVisitorWithPartner`, which will be
   introduced in https://github.com/apache/iceberg-rust/pull/731. Maybe we can
   skip schemas without field IDs for now and support applying the name mapping
   once #731 is complete.
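
To make the proposed interim behavior concrete, here is a rough sketch of accepting only files whose schema already carries field IDs and rejecting everything else until name mapping is available. The `Field` class, `has_ids`, and `resolve_schema` are hypothetical names made up for illustration; they are not iceberg-rust or PyIceberg APIs, and the rule that every nested field must have an ID is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for a Parquet/Arrow field (not a real library type)."""
    name: str
    field_id: Optional[int] = None
    children: List["Field"] = field(default_factory=list)


def has_ids(fields: List[Field]) -> bool:
    """Return True only if every field, including nested ones, has a field ID.

    Assumption: a single missing ID anywhere makes the whole schema count as
    "without field IDs".
    """
    return all(f.field_id is not None and has_ids(f.children) for f in fields)


def resolve_schema(fields: List[Field]) -> List[Field]:
    """Interim behavior: pass through schemas with IDs, reject the rest.

    The name-mapping branch is intentionally absent; it would be added once a
    partner-aware schema visitor exists.
    """
    if has_ids(fields):
        return fields
    raise ValueError(
        "Parquet file does not have field-ids; name mapping is not supported yet"
    )
```

With this shape, adding name mapping later would only mean replacing the final `raise` with a second branch, so the early restriction should not need to change callers.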


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

