fenfeng9 commented on issue #50142:
URL: https://github.com/apache/arrow/issues/50142#issuecomment-4743200628

   Just my personal reading of this issue:
   
   I think there are two separate points here.
   
   First, `time32` / `time64` are Parquet `TIME` logical types, not `TIMESTAMP` 
logical types:
   
   https://parquet.apache.org/docs/file-format/types/logicaltypes/#time
   
   So I am not sure that `coerce_timestamps` is documented as applying to 
Arrow `time32` / `time64` at all. In the current implementation, the option 
seems to be applied only to Arrow `TimestampType`.
   
   Second, I think there is still a real issue with `version='1.0'` / 
`version='2.4'`. `NANOS` is treated as a Parquet 2.6 feature, so writing 
`time64[ns]` as `TIME(NANOS)` when an older version is requested looks wrong to 
me.
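
To make the expectation concrete, here is a minimal sketch of the kind of version check I would expect. It is only a model: `time64_unit_for` and `NANOS_MIN_VERSION` are hypothetical names, not PyArrow or parquet-cpp internals, and falling back to `MICROS` is just one possible behavior (raising an error would be another).

```python
# Illustrative model only; not PyArrow's or parquet-cpp's actual code.

# Assumption taken from the comment above: NANOS needs format version 2.6.
NANOS_MIN_VERSION = (2, 6)


def parse_version(version):
    """Turn a writer version string such as '2.4' into a comparable tuple."""
    major, minor = version.split(".")
    return int(major), int(minor)


def time64_unit_for(version, arrow_unit):
    """Pick the Parquet TIME unit for an Arrow time64 unit ('us' or 'ns').

    Hypothetical policy: fall back to MICROS when the requested format
    version is older than the one that allows NANOS.
    """
    if arrow_unit not in ("us", "ns"):
        raise ValueError(f"time64 unit must be 'us' or 'ns', got {arrow_unit!r}")
    if arrow_unit == "ns" and parse_version(version) < NANOS_MIN_VERSION:
        return "MICROS"
    return "NANOS" if arrow_unit == "ns" else "MICROS"


for requested in ("1.0", "2.4", "2.6"):
    print(requested, "->", time64_unit_for(requested, "ns"))
```

Under this model only `version='2.6'` would produce `TIME(NANOS)`; note that the fallback to `MICROS` would drop sub-microsecond precision.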
   
   About `metadata.format_version`: I don't think this is strong evidence by 
itself. The Parquet footer does not store the exact writer option, such as 
`2.4`. It stores the Parquet metadata `version` field, which is an integer 
such as `1` or `2`. PyArrow reads version `2` as the latest supported 2.x 
version, currently `2.6`. So `format_version: 2.6` does not necessarily mean 
that `version='2.4'` was ignored.
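
The round trip described above can be modeled in a few lines. This is a sketch of the behavior as described here, not PyArrow's code; `footer_version_for` and `reported_format_version` are hypothetical names.

```python
# Illustrative model only; names are hypothetical, not PyArrow internals.

# Per the description above: footer version 2 is reported as the latest
# supported 2.x version, currently 2.6.
LATEST_SUPPORTED_2X = "2.6"


def footer_version_for(writer_version):
    """Integer written to the footer `version` field for a writer option."""
    return 1 if writer_version == "1.0" else 2


def reported_format_version(footer_version):
    """Value a reader reports as `format_version` for a footer integer."""
    return "1.0" if footer_version == 1 else LATEST_SUPPORTED_2X


for requested in ("1.0", "2.4", "2.6"):
    stored = footer_version_for(requested)
    print(requested, "-> footer", stored, "-> reported", reported_format_version(stored))
```

In this model `2.4` and `2.6` both store `2` and both read back as `2.6`, so the reported value alone cannot tell them apart.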


