EeshanBembi commented on issue #17539:
URL: https://github.com/apache/datafusion/issues/17539#issuecomment-3288688422
I did some digging into how other databases handle this and wanted
to share what I found.
Other databases consistently treat overflow as an error:
- MySQL (strict mode): "out of range" errors, which is what the SQL
standard calls for
- Oracle: "Numeric Overflow" exceptions
- SQL Server: "Arithmetic overflow error"
- PostgreSQL: Already mentioned in the issue
So yeah, DataFusion is definitely the outlier here by silently
wrapping.
I tested a few more cases locally:
    SELECT 9223372036854775807 + 1;   -- Returns: -9223372036854775808 (yikes!)
    SELECT -9223372036854775808 - 1;  -- Returns: 9223372036854775807 (also wrong)
The good news: I poked around the codebase and found that BinaryExpr
already has a fail_on_overflow field! It's just hardcoded to false
right now. The plumbing is basically there.
Also, Arrow already has all the checked arithmetic we need
(add_checked, multiply_checked, etc.). So this might not be as big a
lift as it first seemed.
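
To make the difference concrete, here is a minimal sketch using only Rust's
standard library rather than Arrow's kernels. The function names
(wrapping_add_i64, checked_add_i64, and so on) are hypothetical and only
illustrate the two behaviors; they are not DataFusion or Arrow APIs.

```rust
/// Wrapping addition: silently wraps around on overflow,
/// which is the behavior the two queries above show.
fn wrapping_add_i64(a: i64, b: i64) -> i64 {
    a.wrapping_add(b)
}

/// Wrapping subtraction, same idea as above.
fn wrapping_sub_i64(a: i64, b: i64) -> i64 {
    a.wrapping_sub(b)
}

/// Checked addition: returns an error instead of a wrapped value.
/// The error text is made up for this example.
fn checked_add_i64(a: i64, b: i64) -> Result<i64, String> {
    a.checked_add(b)
        .ok_or_else(|| format!("Arithmetic overflow: {a} + {b}"))
}

/// Checked subtraction, same idea as above.
fn checked_sub_i64(a: i64, b: i64) -> Result<i64, String> {
    a.checked_sub(b)
        .ok_or_else(|| format!("Arithmetic overflow: {a} - {b}"))
}
```

With these, wrapping_add_i64(i64::MAX, 1) yields i64::MIN, matching the
first query, while checked_add_i64(i64::MAX, 1) returns an Err.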
Thinking about implementation:
- Could start with just multiplication since that's the most obvious
case
- Maybe add a session config flag for the transition period that
@Omega359 mentioned
- Then gradually expand to other operations
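
A rough sketch of how a flag-gated multiplication could look, assuming a
boolean setting chooses between the two behaviors. OverflowConfig and
multiply_i64 are hypothetical names for illustration only; this uses plain
slices rather than Arrow arrays and is not how BinaryExpr is actually wired.

```rust
/// Hypothetical stand-in for a session-level setting.
/// Not a real DataFusion config option.
#[derive(Clone, Copy)]
struct OverflowConfig {
    fail_on_overflow: bool,
}

/// Element-wise multiplication over two equal-length slices.
/// When `fail_on_overflow` is set, the first overflow aborts with an error;
/// otherwise the result wraps (the current behavior described above).
fn multiply_i64(
    lhs: &[i64],
    rhs: &[i64],
    config: OverflowConfig,
) -> Result<Vec<i64>, String> {
    if lhs.len() != rhs.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            lhs.len(),
            rhs.len()
        ));
    }
    lhs.iter()
        .zip(rhs)
        .map(|(&a, &b)| {
            if config.fail_on_overflow {
                a.checked_mul(b)
                    .ok_or_else(|| format!("Arithmetic overflow: {a} * {b}"))
            } else {
                Ok(a.wrapping_mul(b))
            }
        })
        .collect()
}
```

Defaulting the flag to false would keep existing queries working during a
transition period, and flipping it later would turn wrapped results into
errors.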
Happy to take a crack at this if no one else is already working on
it. Seems like a good mix of correctness improvement + leveraging
existing infrastructure.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]