fenfeng9 commented on issue #50142: URL: https://github.com/apache/arrow/issues/50142#issuecomment-4743200628
Just my personal reading of this issue: I think there are two separate points here.

First, `time32` / `time64` are Parquet `TIME` logical types, not `TIMESTAMP` logical types: https://parquet.apache.org/docs/file-format/types/logicaltypes/#time

So I am not sure that `coerce_timestamps` is currently documented as applying to Arrow `time32` / `time64`. From the current implementation, this option seems to be applied to Arrow `TimestampType` only.

Second, I think there is still a real issue with `version='1.0'` / `version='2.4'`. `NANOS` is treated as a Parquet 2.6 feature, so writing `time64[ns]` as `TIME(NANOS)` when an older version is requested looks wrong to me.

About `metadata.format_version`: I don't think this is strong evidence by itself. The Parquet footer does not store the exact writer option like `2.4`. It stores the Parquet metadata `version` field, which is an integer value such as `1` or `2`. PyArrow reads version `2` as the latest supported 2.x version, currently `2.6`. So `format_version: 2.6` does not necessarily mean that `version='2.4'` was ignored.
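To make the `format_version` point concrete, here is a rough model of the round trip as I understand it. This is only a sketch: `footer_version_for` and `reported_format_version` are hypothetical helper names, not PyArrow APIs, and the mapping simply encodes the behaviour described above (the footer keeps an integer, and `2` is reported as the latest supported 2.x version).

```python
# Illustrative model only: these helpers are hypothetical, not PyArrow APIs.

# Assumption taken from the comment above: footer version 2 is currently
# reported as the latest supported 2.x version.
LATEST_SUPPORTED_2X = "2.6"


def footer_version_for(writer_version: str) -> int:
    """Map a requested writer version string to the integer kept in the footer."""
    major = writer_version.split(".")[0]
    return 1 if major == "1" else 2


def reported_format_version(footer_version: int) -> str:
    """Map the footer integer back to the string a reader would report."""
    return "1.0" if footer_version == 1 else LATEST_SUPPORTED_2X


# Every requested 2.x version collapses to the same footer integer,
# so the reader cannot tell which 2.x version was requested.
for requested in ("1.0", "2.4", "2.6"):
    stored = footer_version_for(requested)
    print(requested, "->", stored, "->", reported_format_version(stored))
```

Under this model `'2.4'` and `'2.6'` are indistinguishable after the write, which is why the reported `format_version` alone cannot show whether the `version` option was honoured. Inspecting the written column's logical type would be a more direct check.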
