melgenek commented on issue #4495:
URL: 
https://github.com/apache/arrow-datafusion/issues/4495#issuecomment-1407497719

   I migrated `decimal.rs` in 
https://github.com/apache/arrow-datafusion/pull/5086. I took the tests as is 
and converted them to `.slt`. The problem is that the result ordering in these 
tests is not defined. For example, [this 
test](https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sqllogictests/test_files/decimal.slt#L91-L96)
 doesn't have an `order by` clause.
   ```
   query RRI?R
   select * from decimal_simple where c1 > c5;
   ----
   0.00002 0.000000000002 3 false 0.000019
   0.00003 0.000000000003 5 true 0.000011
   0.00005 0.000000000005 8 false 0.000033
   ```
   It seems that this test passes in CI now, and passed before when it lived 
in `decimal.rs`, only because the DataFusion implementation hasn't yet changed 
enough to alter the order of the results.
   
   @jackwener wasn't as lucky with his union tests, which eventually failed on 
the master branch: 
https://github.com/apache/arrow-datafusion/pull/5095.
   
   I'd like to introduce some determinism to the decimal tests, and probably 
to some other tests that don't have explicit ordering. What is the best way to 
do this?
   
   I see that DataFusion uses both `rowsort` and `order by`. [DuckDB 
states](https://duckdb.org/dev/sqllogictest/result_verification#result-sorting) 
that it prefers an explicit `order by`. But, for example, 
[CockroachDB](https://github.com/cockroachdb/cockroach/search?q=rowsort) and 
[Risingwave](https://github.com/risingwavelabs/risingwave/search?p=1&q=rowsort) 
use `rowsort` quite extensively.
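   
   To make the question concrete, here is a sketch of the two options applied 
to the test above. The expected rows assume that the three rows shown earlier 
are the complete result; they are not taken from a new run. As far as I 
understand, `rowsort` compares the rendered text of each row, so in general its 
order may differ from a numeric `order by`.
   ```
   # Option 1: let the test runner sort the returned rows before comparing
   query RRI?R rowsort
   select * from decimal_simple where c1 > c5;
   ----
   0.00002 0.000000000002 3 false 0.000019
   0.00003 0.000000000003 5 true 0.000011
   0.00005 0.000000000005 8 false 0.000033

   # Option 2: make the query itself deterministic
   query RRI?R
   select * from decimal_simple where c1 > c5 order by c1;
   ----
   0.00002 0.000000000002 3 false 0.000019
   0.00003 0.000000000003 5 true 0.000011
   0.00005 0.000000000005 8 false 0.000033
   ```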
   
   Are there any guidelines for using or not using `rowsort` and `order by` in 
DataFusion?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
