pepijnve commented on PR #17813:
URL: https://github.com/apache/datafusion/pull/17813#issuecomment-3506492543
@alamb Sorry to keep poking you about this. I have a hard time letting
interesting problems go; they live in my head rent free. But I'll park this one
for good if you think it's really a dead end.
I had a look at `interval_arithmetic.rs` to see if `Interval` could be used
in this implementation. I don't think that's going to work. SQL's predicate
logic domain has three elements: true, false, and unknown (`T`, `F`, `U` for
short). These are three distinct elements. `Interval` models `U` as `{ T, F }`,
but that's not really correct: `U` is neither `{ T, F }` nor `{ }`. You
can't really use a single interval and interval arithmetic to represent general
sets whose elements have no ordering.
This matters for tests like `is unknown`, `is true`, etc. When computing
bounds bottom up you get the following results:
1. One of true or false: `{ T, F } is unknown -> { F }`
2. Any possible value: `{ U, T, F } is unknown -> { T, F }`
3. Certainly unknown: `{ U } is unknown -> { T }`
4. No possible values: `{ } is unknown -> { }`
You can see how those different possibilities can bubble up from simple
boolean arithmetic:
- `{ T, F } and { U } -> { U, F }`
- `{ T, F } and { U, T } -> { U, T, F }`
- `{ T, F } and { U, F } -> { U, F }`
- `{ T, F } and { T, F } -> { T, F }`
- `{ T, F } and { F } -> { F }`
- `{ T, F } and { U, T, F } -> { U, T, F }`
With just a single `Interval` I don't see how you could distinguish between
these situations.
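To make the set-based idea concrete, here is a rough sketch of what I have in mind. `Tri`, `TriSet`, `and3`, and the methods below are made-up names for illustration, not existing DataFusion types, and this is not what the PR currently does. Each set of possible truth values is a 3-bit mask, and operators are evaluated over every combination of possible inputs.

```rust
// Hypothetical sketch: `Tri`, `TriSet`, and every function here are
// illustrative names, not existing DataFusion types or APIs.

/// One value of SQL's three-valued logic.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tri {
    T,
    F,
    U,
}

const ALL: [Tri; 3] = [Tri::T, Tri::F, Tri::U];

/// Bit assigned to each truth value inside a `TriSet` mask.
fn bit(v: Tri) -> u8 {
    match v {
        Tri::T => 0b001,
        Tri::F => 0b010,
        Tri::U => 0b100,
    }
}

/// Three-valued AND on single values.
pub fn and3(a: Tri, b: Tri) -> Tri {
    match (a, b) {
        (Tri::F, _) | (_, Tri::F) => Tri::F,
        (Tri::T, Tri::T) => Tri::T,
        _ => Tri::U,
    }
}

/// Any subset of { T, F, U }, stored as a 3-bit mask.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TriSet(u8);

impl TriSet {
    /// Build a set from a list of possible values.
    pub fn of(values: &[Tri]) -> TriSet {
        TriSet(values.iter().fold(0u8, |mask, v| mask | bit(*v)))
    }

    pub fn contains(self, v: Tri) -> bool {
        self.0 & bit(v) != 0
    }

    /// Bounds of `a AND b`: apply `and3` to every pair of possible values.
    pub fn and(self, other: TriSet) -> TriSet {
        let mut mask = 0u8;
        for a in ALL {
            if !self.contains(a) {
                continue;
            }
            for b in ALL {
                if other.contains(b) {
                    mask |= bit(and3(a, b));
                }
            }
        }
        TriSet(mask)
    }

    /// Bounds of `x IS UNKNOWN`: T if x may be U, F if x may be T or F.
    pub fn is_unknown(self) -> TriSet {
        let mut mask = 0u8;
        if self.contains(Tri::U) {
            mask |= bit(Tri::T);
        }
        if self.contains(Tri::T) || self.contains(Tri::F) {
            mask |= bit(Tri::F);
        }
        TriSet(mask)
    }
}
```

With this, `TriSet::of(&[Tri::T, Tri::F]).and(TriSet::of(&[Tri::U]))` gives the set `{ U, F }`, and calling `is_unknown()` on `{ T, F }`, `{ U }`, and `{ U, T, F }` gives three different answers, which is exactly the distinction a single interval cannot carry.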
That being said, the current implementation isn't as precise as what I
described above either. Would it make sense to go for a more rigorous set-based
implementation, or are you basically saying "don't bother, there's no chance
we'll even consider this"?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]