Michael-J-Ward commented on issue #665: URL: https://github.com/apache/datafusion-python/issues/665#issuecomment-2099448911
It's probably related to this issue in arrow-rs: [Rust Interval definition is incorrect](https://github.com/apache/arrow-rs/issues/5654).

Here's a [godbolt link](https://godbolt.org/z/jaMn36Ghc) demonstrating the "1 month becomes 1 nanosecond" example. (I based that on [a comment in a similar thread in duckdb-wasm](https://github.com/duckdb/duckdb-wasm/issues/1696#issuecomment-2047272977).)

I would suspect that if all code paths used the same impl, then `datafusion-python` wouldn't notice it, but perhaps that's wrong, or maybe not all code paths use arrow-rs?

The error occurs in the pyo3 magic as we cross the `python -> rust` bridge. Notice that the Python side asserts a `pyarrow.Scalar`, but the Rust side then receives a `datafusion::ScalarValue`. (Aside: is this magic type conversion intentional?)

python side: https://github.com/apache/datafusion-python/blob/67d4cfb847e42724319aeec9014889262e6ea58a/datafusion/__init__.py#L182-L185

rust side: https://github.com/apache/datafusion-python/blob/67d4cfb847e42724319aeec9014889262e6ea58a/src/expr.rs#L229-L232

The error is already present before the Rust method is invoked. Adding print statements on both sides of the bridge shows:

```console
converting: MonthDayNano(months=1, days=0, nanoseconds=0)
converting: IntervalMonthDayNano("1")
```
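For illustration, here is a minimal Python sketch of how a layout mismatch of this kind could turn 1 month into 1 nanosecond. It assumes the writer uses the Arrow format's field order (`int32` months, `int32` days, `int64` nanoseconds, little-endian) and that the reader treats the same 16 bytes as a single little-endian 128-bit integer with months in the top 32 bits, which is my reading of the linked arrow-rs issue rather than something verified against the code paths here. The function names are hypothetical and exist only for this example.

```python
import struct


def encode_arrow_spec(months: int, days: int, nanos: int) -> bytes:
    """Pack an interval as int32 months, int32 days, int64 nanos (little-endian)."""
    return struct.pack("<iiq", months, days, nanos)


def _to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned field of the given width as two's complement."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def decode_as_packed_i128(buf: bytes) -> tuple[int, tuple[int, int, int]]:
    """Read 16 bytes as one little-endian i128 and split it into fields.

    Assumed layout (not confirmed here): months in bits 96..127,
    days in bits 64..95, nanoseconds in bits 0..63.
    """
    value = int.from_bytes(buf, "little", signed=True)
    months = _to_signed((value >> 96) & 0xFFFFFFFF, 32)
    days = _to_signed((value >> 64) & 0xFFFFFFFF, 32)
    nanos = _to_signed(value & 0xFFFFFFFFFFFFFFFF, 64)
    return value, (months, days, nanos)


raw = encode_arrow_spec(months=1, days=0, nanos=0)
value, parts = decode_as_packed_i128(raw)
print(value, parts)  # value == 1, parts == (0, 0, 1)
```

Under these assumptions the raw 128-bit value is `1`, which would be consistent with the `IntervalMonthDayNano("1")` print on the Rust side, and the month count ends up in the nanoseconds field.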
