raulcd commented on issue #47051: URL: https://github.com/apache/arrow/issues/47051#issuecomment-3057091797
The following commit (https://github.com/apache/arrow/commit/6822857775bafc765b9e75a09e0b7470ce1a957b) had a successful [verify-rc-source-windows](https://github.com/ursacomputing/crossbow/actions/runs/15779481696) run on the 20th of June. Today I retried a new crossbow job with the same commit (https://github.com/ursacomputing/crossbow/actions/runs/16192885087/workflow#L46), which [failed](https://github.com/ursacomputing/crossbow/actions/runs/16192885087/job/45712508289) with the same PyArrow Parquet related failures. This suggests the failure is not caused by a code change but by a CI/build/toolchain issue.

There are some very minor dependency differences, but the original main suspects, pandas and numpy, are the same version in both cases. One difference I've found is that the first failure used an updated GH runner image, which updated the MSVC compiler from `19.43.34808.0` to `19.44.35209.0` (Visual Studio 2022 Developer Command Prompt v17.13.2 to v17.14.5), so it uses the [Microsoft STL](https://github.com/microsoft/STL/releases) 17.14 release.

I find it surprising that, if this were a compiler-related issue, it would only appear in the PyArrow Parquet tests, even though the error looks like a data integrity issue for `uint8`:

```
left = array([224, 225, 226, 227, 228, 229, 230, 231, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,..., 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43], dtype=uint8)
right = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,..., 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43], dtype=uint8)
```

@pitrou @kou any idea what I could look at?

_Note_: I don't have a Windows machine to reproduce this, but I could potentially set one up; it'll take me a little while.
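
To make the `uint8` mismatch easier to reason about, here is a minimal sketch that diffs the two arrays from the failure output. It is plain Python and only uses the leading elements that are visible before the elided `...` section; the names `left`, `right`, `mismatch`, and `delta` are illustrative and not taken from the test suite.

```python
# Leading elements copied from the failure output above. The elided middle
# section of the arrays is intentionally not reconstructed.
left = [224, 225, 226, 227, 228, 229, 230, 231, 8, 9, 10,
        11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
right = list(range(21))

# Positions where the two sequences disagree.
mismatch = [i for i, (a, b) in enumerate(zip(left, right)) if a != b]

# Difference at those positions, reduced modulo 256 to mimic uint8 wraparound.
delta = [(left[i] - right[i]) % 256 for i in mismatch]

print("mismatching indices:", mismatch)
print("difference at those indices:", delta)
```

On this visible prefix, the mismatch is confined to indices 0 through 7, and each of those values is off by exactly 224 (`0xE0`); the remaining visible elements match.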