alamb commented on issue #18863: URL: https://github.com/apache/datafusion/issues/18863#issuecomment-3562662845
> The results are changing every time I hit the query with 10 partitions.

On a cursory look at the query and results, the query appears to be non-deterministic in the sense that it has multiple correct answers (both results you show are ordered correctly by "c").

<img width="1048" height="811" alt="Image" src="https://github.com/user-attachments/assets/7b3870d2-9100-46b1-b0ff-6278b605b568" />

As for where the run-to-run variation comes from, I suspect it is the order in which the data is processed on different cores (basically which thread finishes first). You could test this theory by setting `target_partitions` to 1; I expect the results would then be consistent from run to run.

So the TL;DR is that I think this behavior is expected.
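To make the reasoning about thread finish order concrete, here is a toy Python sketch. It is not DataFusion code and does not reproduce the query from this issue: `run_query`, `finish_order`, and the sample rows are hypothetical names and data chosen for illustration. It assumes rows are split round-robin across partitions and that rows tied on "c" keep whatever order they arrived in.

```python
def run_query(rows, target_partitions, finish_order):
    """Toy model of `ORDER BY c` over several partitions (illustrative only)."""
    # Split the input round-robin into `target_partitions` partitions.
    partitions = [rows[i::target_partitions] for i in range(target_partitions)]
    # Rows reach the final sort in the order their partitions finish.
    arrived = [row for p in finish_order for row in partitions[p]]
    # Stable sort on c only: rows with equal c keep their arrival order.
    return sorted(arrived, key=lambda row: row[1])


def is_sorted_by_c(result):
    """True if the c column (second tuple field) is non-decreasing."""
    return all(a[1] <= b[1] for a, b in zip(result, result[1:]))


# Hypothetical (id, c) rows with many ties on c.
rows = [(i, i % 2) for i in range(6)]

# Three partitions, finishing in two different orders.
run_a = run_query(rows, 3, finish_order=[0, 1, 2])
run_b = run_query(rows, 3, finish_order=[2, 1, 0])

# One partition: there is only one possible finish order.
single = run_query(rows, 1, finish_order=[0])

print("run_a :", run_a)
print("run_b :", run_b)
print("single:", single)
print("both sorted by c:", is_sorted_by_c(run_a) and is_sorted_by_c(run_b))
print("same row order  :", run_a == run_b)
```

In this model both multi-partition runs are valid answers to `ORDER BY c`, yet they list the tied rows in different orders, while the single-partition run has nothing to vary. In DataFusion itself the corresponding setting is the `datafusion.execution.target_partitions` config option, which can be changed with `SET datafusion.execution.target_partitions = 1;` in a SQL session.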
