mroeschke commented on PR #49210:
URL: https://github.com/apache/arrow/pull/49210#issuecomment-3885821437

   > I don't think we should change this for to_pandas_dtype() if not changing 
it for to_pandas() itself? (i.e. both should ideally match?)
   
   Ah sure, I think there should be consistency in both APIs.
   
   For `pyarrow.DataType.to_pandas_dtype()`, I would interpret "Return the 
equivalent NumPy / Pandas dtype" in the docs as meaning the "closest 
representable" type for types with no exact equivalent, like the `date` types. 
By that reading, `datetime64[s]` makes more sense to me than `datetime64[ms]`, 
because second resolution can represent a wider range of the dates that a 
`date32/64` might hold. This was at least our expectation in 
[cuDF](https://github.com/rapidsai/cudf/pull/21369).
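
A rough sketch of the ranges under discussion, using only integer-width arithmetic. It assumes the usual storage layouts (Arrow `date32` as int32 days, `date64` as int64 milliseconds, NumPy `datetime64` as an int64 count of its unit) and ignores calendar details and NaT sentinels. `span_years` is a hypothetical helper written for this illustration, not a pyarrow or pandas API.

```python
# Approximate reach, in years on either side of the 1970 epoch, of a
# signed integer counting ticks of a given length.
SECONDS_PER_YEAR = 365.2425 * 86_400  # mean Gregorian year


def span_years(bits: int, seconds_per_tick: float) -> float:
    """Largest positive tick count for `bits`, converted to years."""
    max_ticks = 2 ** (bits - 1) - 1
    return max_ticks * seconds_per_tick / SECONDS_PER_YEAR


spans = {
    "date32 (int32 days)": span_years(32, 86_400),
    "date64 (int64 ms)": span_years(64, 1e-3),
    "datetime64[ms]": span_years(64, 1e-3),
    "datetime64[s]": span_years(64, 1.0),
}

for name, years in spans.items():
    print(f"{name:>20}: about +/-{years:.3g} years")
```

Under these assumptions `datetime64[s]` reaches roughly 1000 times further than `datetime64[ms]`, while `datetime64[ms]` has the same reach as `date64`.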
   
   Conversely, it appears `ms` was chosen in 
https://github.com/apache/arrow/pull/35656 for Parquet compatibility, IIUC? 
If so, does that dependence between pandas ops and Parquet still exist 
today?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
